# Deployment Architecture

TruePortAI is deployed as a set of independently scalable services across **AWS** (primary cloud) and an **On-Premise/VPC** environment (for the ML analytics engine). The Lambda services are stateless; persistent state is held in MongoDB Atlas, Redis, and the per-tenant S3 audit-log buckets.

---

## Deployment Topology

```mermaid
graph TB
    subgraph "Internet"
        USER["End Users / Clients"]
    end

    subgraph "AWS — us-east-1 (Primary)"
        subgraph "Edge Layer"
            CF["AWS CloudFront\nGlobal CDN · TLS Termination\nDDoS protection via AWS Shield"]
        end

        subgraph "Static Assets (S3)"
            S3UI["S3: platform-ui\nAngular SPA"]
            S3TPUI["S3: trueportai-ui\nAngular SPA"]
            S3WWW["S3: trueportai-www\nNext.js Static Export"]
            S3DOCS["S3: docs\nSphinx HTML"]
        end

        subgraph "Serverless Compute (Lambda)"
            LPB["Lambda: platform-backend\nFastAPI via Mangum\npython3.11 runtime\n512MB · 30s timeout"]
            LTS["Lambda: trueportai-services\nFastAPI via Mangum\npython3.11 runtime\n512MB · 30s timeout"]
        end

        subgraph "API Gateway"
            APIGW["AWS API Gateway\nHTTP API\nRoutes: /platform/* · /trueportai-services/*"]
        end

        subgraph "Data"
            REDIS["Amazon ElastiCache (Redis)\nRate limits · Circuit breaker state\ncache.t3.medium cluster"]
            S3LOG["S3: Tenant Log Buckets\n{tenant-slug}-audit-logs\nServer-side encryption (SSE-S3)"]
        end
    end

    subgraph "MongoDB Atlas (M10+)"
        ATLAS["Atlas Cluster\nsaas_platform_core DB\ntrueport_ai DB\nAuto-scaling · Backups enabled"]
    end

    subgraph "On-Premise / Private VPC"
        subgraph "Analytics Engine"
            ANLSVC["Analytics Service\nPython 3.11\nFastAPI HTTP trigger"]
            TRITON["NVIDIA Triton Inference Server\nGPU Node (A100 / RTX 4090)"]
            subgraph "Loaded ML Models"
                M1["RoBERTa NER\n(PII/PHI Detection)"]
                M2["DeBERTa-v3\n(Bias/Toxicity)"]
                M3["DistilBERT\n(Injection Shield)"]
                M4["Regex + Entropy\n(Exfil Guard)"]
            end
        end
    end

    subgraph "External Services"
        SES["AWS SES\nTransactional Email"]
        STRIPE["Stripe\nBilling / Subscriptions"]
        GH_ACTIONS["GitHub Actions\nCI/CD Pipelines"]
    end

    USER --> CF
    CF --> S3UI & S3TPUI & S3WWW & S3DOCS
    CF --> APIGW
    APIGW --> LPB
    APIGW --> LTS

    LPB --> ATLAS
    LTS --> ATLAS
    LTS --> REDIS
    LTS --> S3LOG

    S3LOG -->|S3 Event Notification| ANLSVC
    ANLSVC --> TRITON
    TRITON --> M1 & M2 & M3 & M4
    ANLSVC --> ATLAS

    LPB --> SES
    LPB --> STRIPE

    GH_ACTIONS --> LPB & LTS & S3UI & S3TPUI

    style CF fill:#FF9500,color:#fff
    style ATLAS fill:#50C878,color:#fff
    style TRITON fill:#FF6B6B,color:#fff
    style REDIS fill:#DC382D,color:#fff
```
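
The `S3 Event Notification` edge in the diagram implies that the analytics service has to work out which tenant and which log object a notification refers to. A minimal sketch of that step follows, assuming the payload uses the standard S3 event notification JSON layout (`Records[].s3.bucket.name` and `Records[].s3.object.key`) and that buckets follow the `{tenant-slug}-audit-logs` naming shown above. `LogObject` and `parse_s3_event` are hypothetical names, not part of the TruePortAI codebase.

```python
from dataclasses import dataclass
from urllib.parse import unquote_plus

# Suffix taken from the "{tenant-slug}-audit-logs" bucket naming in the diagram.
BUCKET_SUFFIX = "-audit-logs"


@dataclass(frozen=True)
class LogObject:
    """Hypothetical container for one audit-log object to analyse."""
    tenant_slug: str
    bucket: str
    key: str


def parse_s3_event(event: dict) -> list[LogObject]:
    """Return one LogObject per record of an S3 event notification payload."""
    objects = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 notifications (spaces become "+").
        key = unquote_plus(record["s3"]["object"]["key"])
        if not bucket.endswith(BUCKET_SUFFIX):
            continue  # skip buckets that do not follow the tenant naming scheme
        objects.append(LogObject(bucket[: -len(BUCKET_SUFFIX)], bucket, key))
    return objects
```

Deriving the tenant from the bucket name keeps the trigger stateless, but it only works while the naming convention holds; a lookup in the `tenants` collection would be the more defensive choice.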

---

## Lambda Function Configuration

### `platform-backend` Lambda

| Setting | Value |
|---------|-------|
| Runtime | `python3.11` |
| Handler | `zappa_entry.handler` (Mangum/Zappa) |
| Memory | 512 MB |
| Timeout | 30 seconds |
| Concurrency | 100 (reserved) |
| VPC | No (runs outside the VPC; Atlas access restricted by IP allowlist) |
| Environment | `MONGO_URI`, `SECRET_KEY`, `SMTP_*`, `CORS_ORIGIN_REGEX` |

### `trueportai-services` Lambda

| Setting | Value |
|---------|-------|
| Runtime | `python3.11` |
| Handler | `zappa_entry.handler` (Mangum/Zappa) |
| Memory | 512 MB |
| Timeout | 30 seconds |
| Concurrency | 200 (reserved) |
| VPC | Yes (private subnet with ElastiCache access) |
| Environment | `MONGO_URI`, `REDIS_URL`, `SECRET_KEY` |

---

## CloudFront Distribution

```mermaid
graph LR
    CF["CloudFront Distribution\ndomain: *.trueportai.com"]

    CF -->|"/*.* (static assets)"| S3["S3 Origins\n(SPA / WWW / Docs)"]
    CF -->|"/platform/api/*"| LPB["Lambda: platform-backend\nvia API Gateway"]
    CF -->|"/trueportai-services/*"| LTS["Lambda: trueportai-services\nvia API Gateway"]

    subgraph "CloudFront Behaviors"
        B1["Default /* → trueportai-www S3"]
        B2["/app/* → trueportai-ui S3"]
        B3["/platform/app/* → platform-ui S3"]
        B4["/platform/api/* → API Gateway (platform-backend)"]
        B5["/trueportai-services/* → API Gateway (trueportai-services)"]
        B6["/docs/* → S3 docs bucket"]
    end

    CF --- B1 & B2 & B3 & B4 & B5 & B6
```

**CloudFront Settings**:
- **SSL/TLS**: ACM certificate — `*.trueportai.com`
- **HTTP/2**: Enabled
- **Compression**: Gzip + Brotli
- **Cache Policy**: TTL 0 for API paths; 86400 s (24 h) for static assets
- **Security Headers**: HSTS, X-Frame-Options, CSP via Lambda@Edge
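
CloudFront evaluates cache behaviors in their listed order and applies the first path pattern that matches, with the default behavior last. The sketch below models behaviors B1 to B6 and the cache TTLs as an ordered table to make the routing easy to reason about. The origin labels (`s3:...`, `apigw:...`) and the `resolve` function are illustrative assumptions, and the ordering is one plausible precedence, not necessarily the one configured on the real distribution.

```python
from fnmatch import fnmatchcase

# (path pattern, origin label, cache TTL in seconds), highest precedence first.
# Patterns mirror behaviors B1-B6; TTLs follow the cache policy listed above.
BEHAVIORS = [
    ("/platform/api/*", "apigw:platform-backend", 0),
    ("/platform/app/*", "s3:platform-ui", 86400),
    ("/trueportai-services/*", "apigw:trueportai-services", 0),
    ("/app/*", "s3:trueportai-ui", 86400),
    ("/docs/*", "s3:docs", 86400),
    ("*", "s3:trueportai-www", 86400),  # default behavior, always evaluated last
]


def resolve(path: str) -> tuple[str, int]:
    """Return (origin, ttl) for the first behavior whose pattern matches the path."""
    for pattern, origin, ttl in BEHAVIORS:
        if fnmatchcase(path, pattern):
            return origin, ttl
    raise LookupError(path)  # unreachable while the default "*" entry exists
```

For example, `resolve("/platform/api/users")` selects the `platform-backend` origin with a TTL of 0, while any path not covered by a specific pattern falls through to the marketing site bucket.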

---

## CI/CD Pipeline

```mermaid
flowchart LR
    subgraph "GitHub"
        PR[Pull Request] --> CI
        CI[GitHub Actions CI]
        CI -->|"pytest + linting"| PASS{Tests Pass?}
        PASS -->|No| FAIL[Block Merge]
        PASS -->|Yes| MERGE[Merge to main]
        MERGE --> CD[GitHub Actions CD]
    end

    subgraph "Deployment"
        CD -->|"zappa update production"| LPB[Lambda: platform-backend]
        CD -->|"zappa update production"| LTS[Lambda: trueportai-services]
        CD -->|"ng build + aws s3 sync"| S3UI[S3: platform-ui]
        CD -->|"ng build + aws s3 sync"| S3TPUI[S3: trueportai-ui]
        CD -->|"next build + aws s3 sync"| S3WWW[S3: trueportai-www]
        LPB & LTS & S3UI & S3TPUI & S3WWW --> CF[CloudFront Invalidation]
    end
```
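
The final `CloudFront Invalidation` step can be sketched as a small helper that builds the arguments for the CloudFront `CreateInvalidation` API. This is an illustration only: the helper name, the distribution ID, and the choice of a timestamp as `CallerReference` are assumptions, and the actual pipeline may use the AWS CLI instead.

```python
import time
from typing import Optional


def invalidation_request(distribution_id: str, paths: list[str],
                         reference: Optional[str] = None) -> dict:
    """Build keyword arguments for the CloudFront CreateInvalidation call."""
    unique = sorted(set(paths))  # de-duplicate so Quantity matches Items
    if not unique or any(not p.startswith("/") for p in unique):
        raise ValueError("each invalidation path must start with '/'")
    return {
        "DistributionId": distribution_id,
        "InvalidationBatch": {
            "Paths": {"Quantity": len(unique), "Items": unique},
            # CallerReference must be unique per request; a timestamp is one simple choice.
            "CallerReference": reference or f"deploy-{int(time.time())}",
        },
    }


# With boto3 installed and credentials configured, a CD job could run:
#   boto3.client("cloudfront").create_invalidation(
#       **invalidation_request("EXAMPLEDISTID", ["/app/*", "/platform/app/*"]))
```

Invalidating only the paths of the artifact that changed (for example `/app/*` after a `trueportai-ui` build) keeps the rest of the edge cache warm.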

---

## Analytics Engine — On-Premise Deployment

```mermaid
graph TB
    subgraph "Private VPC / On-Premise Server"
        subgraph "Python Service Layer"
            API["FastAPI Trigger Service\nPORT 8080\nReceives S3 event webhook"]
            QUEUE["In-Process Job Queue\nasyncio task queue"]
            PIPE["ML Pipeline Orchestrator\n(async, parallel model invocation)"]
        end

        subgraph "NVIDIA Triton Inference Server"
            direction LR
            TRT["Triton HTTP API\nPORT 8000 (HTTP)\nPORT 8001 (gRPC)"]
            M1["Model: pii-ner-roberta\nPyTorch Backend"]
            M2["Model: bias-deberta-v3\nPyTorch Backend"]
            M3["Model: injection-distilbert\nPyTorch Backend"]
            M4["Model: exfil-regex-entropy\nPython Backend"]
            TRT --> M1 & M2 & M3 & M4
        end

        subgraph "Data Access"
            S3CL["boto3 S3 Client"]
            MGCL["Motor MongoDB Client"]
        end
    end

    subgraph "External"
        S3["AWS S3 Bucket\n{tenant-slug}-audit-logs"]
        ATLAS["MongoDB Atlas\nviolations collection"]
        SMTP["SMTP / SES\nAlert emails"]
    end

    S3 -->|"S3 Event → HTTPS webhook"| API
    API --> QUEUE
    QUEUE --> PIPE
    PIPE --> S3CL
    S3CL --> S3
    PIPE --> TRT
    PIPE --> MGCL
    MGCL --> ATLAS
    PIPE --> SMTP
```
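
The queue and orchestrator boxes above can be sketched with plain `asyncio`: a worker drains an in-process queue, and each job fans out to all four models concurrently. This is a minimal illustration, not the service's actual code. The `infer` callable is a hypothetical stand-in for a Triton request, and the `None` shutdown sentinel is an assumption made to keep the example short.

```python
import asyncio
from typing import Awaitable, Callable

# Model names as they appear in the Triton section of the diagram.
MODELS = ["pii-ner-roberta", "bias-deberta-v3", "injection-distilbert", "exfil-regex-entropy"]

# Hypothetical signature: (model name, log text) -> findings for that model.
InferFn = Callable[[str, str], Awaitable[list]]


async def run_pipeline(text: str, infer: InferFn) -> dict:
    """Invoke every model concurrently and collect findings keyed by model name."""
    results = await asyncio.gather(*(infer(m, text) for m in MODELS))
    return dict(zip(MODELS, results))


async def worker(queue: asyncio.Queue, infer: InferFn, sink: list) -> None:
    """Process jobs from the in-process queue until a None sentinel arrives."""
    while True:
        job = await queue.get()
        try:
            if job is None:
                return
            sink.append(await run_pipeline(job, infer))
        finally:
            queue.task_done()


async def demo() -> list:
    async def fake_infer(model: str, text: str) -> list:
        await asyncio.sleep(0)  # stands in for network I/O to Triton
        return [f"{model}:{len(text)}"]

    queue: asyncio.Queue = asyncio.Queue()
    sink: list = []
    task = asyncio.create_task(worker(queue, fake_infer, sink))
    for job in ("log line one", None):
        queue.put_nowait(job)
    await task
    return sink
```

One consequence of an in-process queue is that pending jobs are lost if the process restarts, so a real deployment would need some way to replay unprocessed S3 objects.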

### Model Repository Layout (Triton)

```
/models/
├── pii-ner-roberta/
│   ├── config.pbtxt          # Triton model config
│   └── 1/
│       └── model.pt          # Serialized PyTorch model
├── bias-deberta-v3/
│   ├── config.pbtxt
│   └── 1/
│       └── model.pt
├── injection-distilbert/
│   ├── config.pbtxt
│   └── 1/
│       └── model.pt
└── exfil-regex-entropy/
    ├── config.pbtxt
    └── 1/
        └── model.py          # Triton Python backend script
```
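
Each directory in this layout is addressable through Triton's HTTP inference endpoint, `POST /v2/models/{model}/versions/{version}/infer`, where the numeric subdirectory (`1/`) is the version. The sketch below builds such a request with the standard library only. It assumes the model (or a preprocessing step in front of it) accepts raw text as a `BYTES` tensor; the input name `TEXT`, the host, and the unbatched shape are placeholders that would have to match the real `config.pbtxt`.

```python
import json
from urllib.request import Request

# Port 8000 is the HTTP port from the diagram; the host is an assumption.
TRITON_URL = "http://localhost:8000"


def build_infer_request(model: str, texts: list[str], version: str = "1",
                        input_name: str = "TEXT") -> Request:
    """Build (but do not send) an inference request for one model in the repository."""
    body = {
        "inputs": [{
            "name": input_name,      # must match an input declared in config.pbtxt
            "shape": [len(texts)],   # assumes a 1-D input; batched models need extra dims
            "datatype": "BYTES",     # Triton's datatype for string tensors
            "data": texts,
        }]
    }
    return Request(
        f"{TRITON_URL}/v2/models/{model}/versions/{version}/infer",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` would send it; an orchestrator that invokes models in parallel would more likely use an async HTTP or gRPC client (port 8001).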

---

## MongoDB Atlas Configuration

```mermaid
graph TD
    subgraph "Atlas M10+ Cluster"
        PRIMARY["Primary Node\nRead + Write"]
        SEC1["Secondary Node 1\nRead Replica"]
        SEC2["Secondary Node 2\nRead Replica"]
        PRIMARY --> SEC1 & SEC2
    end

    subgraph "Databases"
        CORE["saas_platform_core\nGlobal platform data"]
        TP["trueport_ai\nPer-tenant gateway data"]
    end

    subgraph "Indexes"
        IDX1["users: email (unique)"]
        IDX2["tenants: slug (unique)"]
        IDX3["api_keys: key (unique)"]
        IDX4["usage_logs: timestamp (desc)"]
        IDX5["usage_logs: api_key + timestamp"]
        IDX6["violations: tenant_id + detected_at"]
        IDX7["otps: expires_at (TTL index)"]
    end

    PRIMARY --> CORE & TP
    CORE & TP --> IDX1 & IDX2 & IDX3 & IDX4 & IDX5 & IDX6 & IDX7
```

**Atlas Settings**:
- **Tier**: M10 (production), M2 (staging/dev)
- **Region**: `us-east-1` (primary), `eu-west-1` (replica for EU customers)
- **Backup**: Continuous cloud backup enabled
- **Encryption**: Encryption-at-rest enabled
- **Network**: IP Access List — only Lambda NAT gateway IPs and analytics engine IPs
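
The indexes in the diagram could be declared in code along the following lines. This is a sketch under assumptions: it targets a pymongo-style `Database` object (with Motor, `create_index` returns an awaitable instead), the sort directions of the compound indexes are guesses where the diagram does not state them, and the diagram does not say which database holds which collection, hence the optional `collections` filter.

```python
ASC, DESC = 1, -1  # same values as pymongo.ASCENDING / pymongo.DESCENDING

# (collection, key spec, index options) for each index in the diagram.
INDEXES = [
    ("users", [("email", ASC)], {"unique": True}),
    ("tenants", [("slug", ASC)], {"unique": True}),
    ("api_keys", [("key", ASC)], {"unique": True}),
    ("usage_logs", [("timestamp", DESC)], {}),
    ("usage_logs", [("api_key", ASC), ("timestamp", DESC)], {}),          # directions assumed
    ("violations", [("tenant_id", ASC), ("detected_at", DESC)], {}),     # directions assumed
    # TTL index: documents are removed once the time stored in expires_at has passed.
    ("otps", [("expires_at", ASC)], {"expireAfterSeconds": 0}),
]


def ensure_indexes(db, collections=None) -> int:
    """Request each listed index on a pymongo-style database; return how many were requested."""
    count = 0
    for collection, keys, options in INDEXES:
        if collections is not None and collection not in collections:
            continue  # lets the caller apply only the indexes that belong to this database
        db[collection].create_index(keys, **options)
        count += 1
    return count
```

`create_index` is idempotent for an identical specification, so running this at service start-up or from a migration script is safe to repeat.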

---

## Environment Configuration

| Variable | Service | Description |
|----------|---------|-------------|
| `MONGO_URI` | Both | MongoDB Atlas connection string |
| `MONGO_DB_NAME` | platform-backend | `saas_platform_core` |
| `MONGO_DB_NAME` | trueportai-services | `trueport_ai` |
| `SECRET_KEY` | Both | HS256 JWT signing key (32+ bytes random) |
| `REDIS_URL` | trueportai-services | `redis://:{password}@{host}:6379/0` |
| `SMTP_HOST_URL` | platform-backend | SMTP server hostname |
| `SMTP_HOST_PORT` | platform-backend | `587` (STARTTLS) |
| `SMTP_HOST_UID` | platform-backend | SMTP username |
| `SMTP_HOST_PWD` | platform-backend | SMTP password |
| `CORS_ORIGIN_REGEX` | platform-backend | Allowed origin pattern |
| `STORAGE_PROVIDER` | platform-backend | `S3`, `AZURE`, `GCP`, or `LOCAL` |
| `AWS_ACCESS_KEY_ID` | platform-backend | IAM key for S3 access |
| `AWS_SECRET_ACCESS_KEY` | platform-backend | IAM secret for S3 access |
| `AWS_S3_BUCKET` | platform-backend | Tenant log archive bucket |
| `AWS_REGION` | platform-backend | `us-east-1` |
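
A start-up check along these lines can catch misconfiguration before the first request is served. It is a sketch only: treating every listed variable as required is an assumption (the table does not mark any as optional), and `validate_env` is a hypothetical helper rather than part of either service.

```python
import os
from typing import Mapping

# Variables each service is assumed to require, taken from the table above.
REQUIRED = {
    "platform-backend": [
        "MONGO_URI", "MONGO_DB_NAME", "SECRET_KEY",
        "SMTP_HOST_URL", "SMTP_HOST_PORT", "SMTP_HOST_UID", "SMTP_HOST_PWD",
        "CORS_ORIGIN_REGEX", "STORAGE_PROVIDER",
    ],
    "trueportai-services": ["MONGO_URI", "MONGO_DB_NAME", "SECRET_KEY", "REDIS_URL"],
}
STORAGE_PROVIDERS = {"S3", "AZURE", "GCP", "LOCAL"}


def validate_env(service: str, env: Mapping[str, str] = os.environ) -> list[str]:
    """Return a list of human-readable problems; an empty list means the config looks sane."""
    problems = [f"missing {name}" for name in REQUIRED[service] if not env.get(name)]
    key = env.get("SECRET_KEY", "")
    if key and len(key.encode("utf-8")) < 32:
        problems.append("SECRET_KEY shorter than 32 bytes")
    provider = env.get("STORAGE_PROVIDER")
    if service == "platform-backend" and provider and provider not in STORAGE_PROVIDERS:
        problems.append(f"unsupported STORAGE_PROVIDER {provider!r}")
    return problems
```

Calling `validate_env("trueportai-services")` at import time and raising on a non-empty result makes a bad deployment fail fast instead of surfacing later as a runtime error.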

---

## Scaling Strategy

| Service | Scale Trigger | Mechanism |
|---------|--------------|-----------|
| platform-backend | Request volume | Lambda auto-concurrency (capped at 100 reserved) |
| trueportai-services | Request volume | Lambda auto-concurrency (capped at 200 reserved) |
| MongoDB | Data volume / IOPS | Atlas auto-scaling |
| Redis (ElastiCache) | Memory usage | Manual tier upgrade |
| Analytics Engine | Processing queue depth | Horizontal — add GPU nodes |
| CloudFront | Automatic | AWS-managed globally |

---

## Disaster Recovery

| Component | RPO | RTO | Strategy |
|-----------|-----|-----|----------|
| MongoDB Atlas | 0s | < 5 min | Replica set auto-failover |
| Lambda | 0s | < 1 min | Multi-AZ by default |
| S3 Logs | 0s | N/A | Multi-AZ durable storage |
| Analytics Engine | 1h | 4h | Manual failover to backup node |
| Redis | Minutes | < 15 min | ElastiCache Multi-AZ |