flamingo-tech-test/docs/architecture-design-company-inc.md
Andriy Oblivantsev d5b2bd2aa4
2026-02-19 19:45:47 +00:00

Architectural Design Document: Company Inc.

Cloud Infrastructure for Web Application Deployment
Version: 1.0
Date: February 2026


1. Executive Summary

This document outlines a robust, scalable, secure, and cost-effective infrastructure design for Company Inc., a startup deploying a web application with a Python/Flask REST API backend, React SPA frontend, and MongoDB database. The design leverages Google Cloud Platform (GCP) with GKE (Google Kubernetes Engine) as the primary compute platform.

Key Design Principles: Security-by-default, scalability from day one, cost optimization for early stage, and GitOps-based operations.


2. Cloud Provider and Environment Structure

2.1 Provider Choice: GCP

Rationale: GCP offers strong managed Kubernetes (GKE) with an Autopilot mode, good MongoDB Atlas integration (or, for a GCP-only stack, a native alternative such as Firestore with MongoDB compatibility), competitive pricing for startups, and simplified networking. GKE Autopilot reduces operational overhead for a small team with limited Kubernetes expertise.

2.2 Multi-Project Structure

| Project | Purpose | Isolation |
|---|---|---|
| company-inc-prod | Production workloads | High; sensitive data |
| company-inc-staging | Staging / pre-production | Medium |
| company-inc-shared | CI/CD, shared tooling, DNS | Low; no PII |
| company-inc-sandbox | Dev experimentation | Lowest |

Benefits:

  • Billing separation per environment
  • Blast-radius containment (prod issues do not affect staging)
  • IAM and network isolation
  • Aligns with GCP best practices for multi-tenant or multi-env setups

3. Network Design

3.1 VPC Architecture

  • One VPC per project (or Shared VPC from company-inc-shared for centralised control)
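  • Regional subnets (in GCP a subnet spans every zone in its region); run workloads in at least 2 zones for HA
  • Private subnets for workloads (no public IPs on nodes)
  • Public exposure only through external load balancers; outbound traffic leaves via Cloud NAT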
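
To make the one-VPC-per-project layout concrete, the sketch below carves non-overlapping address ranges for each project using Python's standard `ipaddress` module. The `10.0.0.0/8` supernet, the /16-per-project sizing, and the node/pod/service split are illustrative assumptions, not values from this design; non-overlapping ranges matter mainly if the VPCs are later peered or moved to a Shared VPC.

```python
import ipaddress

# Project names come from section 2.2; the address plan itself is hypothetical.
PROJECTS = [
    "company-inc-prod",
    "company-inc-staging",
    "company-inc-shared",
    "company-inc-sandbox",
]


def plan_subnets(supernet: str = "10.0.0.0/8") -> dict:
    """Give each project its own /16 and split it into node, pod, and service ranges."""
    blocks = ipaddress.ip_network(supernet).subnets(new_prefix=16)
    plan = {}
    for project, block in zip(PROJECTS, blocks):
        # A /16 splits into exactly four /18s; three are used, one is kept in reserve.
        nodes, pods, services, _reserve = block.subnets(new_prefix=18)
        plan[project] = {"nodes": nodes, "pods": pods, "services": services}
    return plan


def has_overlap(plan: dict) -> bool:
    """Return True if any two planned ranges overlap."""
    ranges = [net for project in plan.values() for net in project.values()]
    return any(a.overlaps(b) for i, a in enumerate(ranges) for b in ranges[i + 1:])


if __name__ == "__main__":
    for project, ranges in plan_subnets().items():
        print(project, {name: str(net) for name, net in ranges.items()})
```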

3.2 Security Layers

| Layer | Controls |
|---|---|
| VPC firewall | Default deny; allow only required CIDRs and ports |
| GKE node pools | Private nodes; no public IPs |
| Pod-level isolation | Kubernetes Network Policies plus GKE-native security controls |
| Ingress | HTTPS only; TLS termination at the load balancer |
| Egress | Cloud NAT for outbound traffic; restrict to necessary destinations |
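
As an illustration of the default-deny posture at the pod level, the following sketch builds two Kubernetes `NetworkPolicy` manifests as plain Python dictionaries: one that denies all traffic in a namespace, and one that lets the API pods reach MongoDB. The `app: api` label, the namespace name, the example CIDR, and port 27017 (which assumes VPC peering rather than a private endpoint with different ports) are hypothetical.

```python
import json


def default_deny(namespace: str) -> dict:
    """Deny all ingress and egress for every pod in the namespace."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "default-deny", "namespace": namespace},
        # An empty podSelector selects every pod in the namespace.
        "spec": {"podSelector": {}, "policyTypes": ["Ingress", "Egress"]},
    }


def allow_api_to_mongo(namespace: str, mongo_cidr: str, port: int = 27017) -> dict:
    """Allow pods labelled app=api to open TCP connections to the database range."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "api-to-mongo", "namespace": namespace},
        "spec": {
            "podSelector": {"matchLabels": {"app": "api"}},
            "policyTypes": ["Egress"],
            "egress": [
                {
                    "to": [{"ipBlock": {"cidr": mongo_cidr}}],
                    "ports": [{"protocol": "TCP", "port": port}],
                }
            ],
        },
    }


if __name__ == "__main__":
    # Hypothetical namespace and database CIDR, printed as JSON (which is valid YAML).
    for policy in (default_deny("prod"), allow_api_to_mongo("prod", "192.168.248.0/21")):
        print(json.dumps(policy, indent=2))
```

With default deny in place, DNS egress also has to be allowed explicitly, or the API pods cannot resolve the database hostname.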

3.3 Network Topology (High-Level)

flowchart TD
    Internet((Internet))
    Internet --> LB[Cloud Load Balancer<br/>HTTPS termination]
    LB --> Ingress[GKE Ingress Controller]

    subgraph VPC["VPC — Private Subnets"]
        Ingress --> API[API Pods<br/>Python / Flask]
        Ingress --> SPA[Frontend Pods<br/>React SPA]
        API --> DB[(MongoDB<br/>Private Endpoint)]
    end

4. Compute Platform: GKE

4.1 Cluster Strategy

  • GKE Autopilot for production and staging to minimise node management
  • Single regional cluster per environment initially; consider multi-region as scale demands
  • Private cluster with no public endpoint; access via IAP or Bastion if needed

4.2 Node Configuration

| Setting | Initial | Growth Phase |
|---|---|---|
| Node type | Autopilot (no manual sizing) | Same |
| Min nodes | 0 (scale to zero when idle) | 2 |
| Max nodes | 5 | 50+ |
| Scaling | Pod-based (HPA, cluster autoscaler) | Same |

4.3 Workload Layout

  • Backend (Python/Flask): Deployment with HPA (CPU/memory); target 2-3 replicas initially
  • Frontend (React): Static assets served via CDN or container; 1-2 replicas
  • Ingress: GKE Ingress for HTTP(S) routing; consider GKE Gateway API for advanced use
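
The scaling behaviour of the backend can be reasoned about with the formula the Kubernetes Horizontal Pod Autoscaler documents: `desired = ceil(current * currentMetric / targetMetric)`, skipped when the ratio is within a tolerance of 1.0 (0.1 by default). The sketch below applies it to a single metric, assuming a replica range of 2 to 3 and a 60% CPU target; both numbers are illustrative.

```python
import math


def desired_replicas(
    current_replicas: int,
    current_utilization: float,
    target_utilization: float,
    min_replicas: int = 2,
    max_replicas: int = 3,
    tolerance: float = 0.1,
) -> int:
    """Single-metric HPA calculation, clamped to the configured replica range."""
    ratio = current_utilization / target_utilization
    if abs(ratio - 1.0) <= tolerance:
        # Close enough to target: the HPA leaves the replica count unchanged.
        desired = current_replicas
    else:
        desired = math.ceil(current_replicas * ratio)
    return max(min_replicas, min(max_replicas, desired))


if __name__ == "__main__":
    # 2 replicas at 90% CPU against a 60% target scale up to 3.
    print(desired_replicas(2, 90, 60))
```

A low load never drops the Deployment below `min_replicas`, which is what keeps at least two API pods available during quiet periods.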

4.4 Containerisation and CI/CD

| Aspect | Approach |
|---|---|
| Image build | Dockerfile per service; multi-stage builds; non-root user |
| Registry | Artifact Registry (the successor to Container Registry, GCR) in company-inc-shared |
| CI | GitHub Actions (or GitLab CI): build, test, security scan |
| CD | ArgoCD or Flux; GitOps with the app-of-apps pattern |
| Secrets | External Secrets Operator + GCP Secret Manager |
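
The app-of-apps pattern means one parent Argo CD `Application` points at a directory containing child `Application` manifests, one per service. The sketch below generates such child manifests as Python dictionaries, assuming Argo CD is the chosen tool (the design leaves Flux open as an alternative). The repository URL, paths, service names, and namespace are hypothetical.

```python
import json


def child_application(name: str, repo_url: str, path: str, namespace: str,
                      revision: str = "main") -> dict:
    """Build one Argo CD Application manifest for a single service."""
    return {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "Application",
        "metadata": {"name": name, "namespace": "argocd"},
        "spec": {
            "project": "default",
            "source": {"repoURL": repo_url, "targetRevision": revision, "path": path},
            # The in-cluster API server address used when Argo CD deploys to its own cluster.
            "destination": {"server": "https://kubernetes.default.svc", "namespace": namespace},
            # Automated sync: remove resources deleted from Git and revert manual drift.
            "syncPolicy": {"automated": {"prune": True, "selfHeal": True}},
        },
    }


def build_children(repo_url: str, services: list, namespace: str) -> list:
    """One child Application per service, each reading from charts/<service>."""
    return [
        child_application(svc, repo_url, f"charts/{svc}", namespace)
        for svc in services
    ]


if __name__ == "__main__":
    # Hypothetical repository and service names.
    apps = build_children("https://example.com/company-inc/deploy.git", ["api", "frontend"], "prod")
    print(json.dumps(apps, indent=2))
```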

5. Database: MongoDB

5.1 Service Choice

MongoDB Atlas (or, if the stack must be strictly GCP-only, a native alternative such as Firestore with MongoDB compatibility) is recommended for:

  • Fully managed, automated backups
  • Multi-region replication
  • Strong security (encryption at rest, VPC peering)
  • Easy scaling

Atlas on GCP supports VPC peering and Private Service Connect, so database traffic can stay on private addresses.

5.2 High Availability and DR

| Topic | Strategy |
|---|---|
| Replicas | 3-node replica set; multi-zone |
| Backups | Continuous backup; point-in-time recovery |
| Disaster recovery | Cross-region replica (e.g. us-central1 + europe-west1) |
| Restore testing | Quarterly DR drills |
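
The reason for a 3-node replica set spread across zones is election arithmetic: a MongoDB replica set can elect a primary only while a strict majority of voting members is reachable. The sketch below works through that arithmetic; the zone names and member placements are illustrative assumptions.

```python
def majority(voting_members: int) -> int:
    """Smallest number of members that forms a strict majority."""
    return voting_members // 2 + 1


def fault_tolerance(voting_members: int) -> int:
    """How many members can fail while a primary can still be elected."""
    return voting_members - majority(voting_members)


def survives_zone_loss(members_per_zone: dict, failed_zone: str) -> bool:
    """Check whether a primary can still be elected after losing one zone."""
    total = sum(members_per_zone.values())
    remaining = total - members_per_zone.get(failed_zone, 0)
    return remaining >= majority(total)


if __name__ == "__main__":
    # One member per zone: losing any single zone leaves 2 of 3, which is a majority.
    spread = {"zone-a": 1, "zone-b": 1, "zone-c": 1}
    print(fault_tolerance(3), survives_zone_loss(spread, "zone-a"))
```

Placing two of the three members in the same zone would defeat the purpose: losing that zone leaves one member, which cannot form a majority.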

5.3 Security

  • Private endpoint (no public IP)
  • TLS for all connections
  • IAM-based access; principle of least privilege
  • Encryption at rest (default in Atlas)
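
On the application side, these controls mostly surface in the connection string. The sketch below assembles a MongoDB URI with TLS enabled and the credentials percent-encoded, using only the standard library. The host name, database name, user, and option choices are hypothetical; in this design the credentials would come from Secret Manager rather than being written in code.

```python
from urllib.parse import quote_plus, urlencode


def build_mongo_uri(user: str, password: str, host: str, database: str) -> str:
    """Assemble a mongodb+srv URI with TLS on and majority write concern."""
    options = urlencode({"tls": "true", "retryWrites": "true", "w": "majority"})
    # Credentials must be percent-encoded, otherwise characters such as '@', ':'
    # or '/' in a password would be parsed as URI delimiters.
    return (
        f"mongodb+srv://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}/{database}?{options}"
    )


if __name__ == "__main__":
    # Placeholder values for illustration only.
    print(build_mongo_uri("api-svc", "p@ss:w/rd", "cluster0.example.mongodb.net", "app"))
```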

6. High-Level Architecture Diagram

flowchart TB
    Users((Users))

    Users --> CDN[Cloud CDN<br/>Static Assets]
    Users --> LB[Cloud Load Balancer<br/>HTTPS]

    subgraph GKE["GKE Cluster — Private"]
        LB --> Ingress[Ingress Controller]
        Ingress --> API["Backend (Flask)<br/>HPA 2-3 replicas"]
        Ingress --> SPA[Frontend — React SPA<br/>Nginx]
        CDN --> SPA
        API --> Redis[Redis<br/>Memorystore]
        API --> Obs[Observability<br/>Prometheus / Grafana]
    end

    subgraph Data["Managed Services"]
        Mongo[(MongoDB Atlas<br/>Replica Set · Private Endpoint)]
        Secrets[Secret Manager<br/>App & DB credentials]
        Registry[Artifact Registry<br/>Container images]
    end

    API --> Mongo
    API --> Secrets
    GKE --> Registry

7. Summary of Recommendations

| Area | Recommendation |
|---|---|
| Cloud | GCP with 4 projects (prod, staging, shared, sandbox) |
| Compute | GKE Autopilot, private nodes, HPA |
| Database | MongoDB Atlas on GCP with multi-zone replicas and automated backups |
| CI/CD | GitHub Actions + ArgoCD/Flux |
| Security | Private VPC, TLS everywhere, Secret Manager, least privilege |
| Cost | Start small; use committed use discounts as usage grows |

See architecture-hld.md for the standalone HLD diagram.