# Architectural Design Document: Company Inc.

**Cloud Infrastructure for Web Application Deployment**
**Version:** 1.0
**Date:** February 2026

---

## 1. Executive Summary

This document outlines a robust, scalable, secure, and cost-effective infrastructure design for Company Inc., a startup deploying a web application with a Python/Flask REST API backend, React SPA frontend, and MongoDB database. The design leverages **Google Cloud Platform (GCP)** with **GKE (Google Kubernetes Engine)** as the primary compute platform.

**Key Design Principles:** Security-by-default, scalability from day one, cost optimization for early stage, and GitOps-based operations.

---

## 2. Cloud Provider and Environment Structure

### 2.1 Provider Choice: GCP

**Rationale:** GCP offers strong managed Kubernetes (GKE) with an Autopilot mode, good MongoDB Atlas integration (or a GCP-native alternative such as Firestore with MongoDB compatibility), competitive pricing for startups, and simplified networking. GKE Autopilot reduces operational overhead for a small team with limited Kubernetes expertise.

### 2.2 Multi-Project Structure

| Project | Purpose | Isolation |
|---------|---------|-----------|
| **company-inc-prod** | Production workloads | High; sensitive data |
| **company-inc-staging** | Staging / pre-production | Medium |
| **company-inc-shared** | CI/CD, shared tooling, DNS | Low; no PII |
| **company-inc-sandbox** | Dev experimentation | Lowest |

**Benefits:**
- Billing separation per environment
- Blast-radius containment (prod issues do not affect staging)
- IAM and network isolation
- Aligns with GCP best practices for multi-tenant or multi-env setups

---

## 3. Network Design

### 3.1 VPC Architecture

- **One VPC per project** (or Shared VPC from `company-inc-shared` for centralised control)
- **Regional subnets**, with workloads spread across at least 2 zones for HA (GCP subnets are regional and span every zone in the region)
- **Private workloads:** nodes and pods have no public IPs
- **Public exposure** only through load balancers; outbound traffic leaves through Cloud NAT

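
Shared VPC and VPC peering both need non-overlapping address ranges, and a VPC-native GKE cluster needs separate ranges for nodes, pods, and services. The sketch below shows one way to plan and check such ranges with Python's standard `ipaddress` module. The `/16` blocks and the `/18` split are illustrative assumptions, not values taken from this design.

```python
import ipaddress

# Hypothetical address plan: one /16 per project (assumed values, adjust freely).
PROJECT_RANGES = {
    "company-inc-prod": "10.10.0.0/16",
    "company-inc-staging": "10.20.0.0/16",
    "company-inc-shared": "10.30.0.0/16",
    "company-inc-sandbox": "10.40.0.0/16",
}


def plan_subnets(project_cidr: str) -> dict:
    """Split a project /16 into node, pod, and service ranges."""
    network = ipaddress.ip_network(project_cidr)
    # A /16 yields four /18 blocks: nodes, pods, services, and one spare.
    nodes, pods, services, _spare = network.subnets(new_prefix=18)
    return {"nodes": nodes, "pods": pods, "services": services}


def find_overlaps(cidrs: list) -> list:
    """Return every pair of CIDR strings whose ranges overlap."""
    nets = [ipaddress.ip_network(c) for c in cidrs]
    return [
        (str(a), str(b))
        for i, a in enumerate(nets)
        for b in nets[i + 1:]
        if a.overlaps(b)
    ]


plans = {name: plan_subnets(cidr) for name, cidr in PROJECT_RANGES.items()}
overlaps = find_overlaps(list(PROJECT_RANGES.values()))

if __name__ == "__main__":
    for name, plan in plans.items():
        print(name, {role: str(net) for role, net in plan.items()})
    print("overlapping ranges:", overlaps)
```

Running a check like this in CI before applying network changes catches range collisions early, which matters because subnet ranges are awkward to change once clusters depend on them.
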
### 3.2 Security Layers

| Layer | Controls |
|-------|----------|
| **VPC Firewall** | Default deny; allow only required CIDRs and ports |
| **GKE node pools** | Private nodes; no public IPs |
| **Pod-level isolation** | Kubernetes Network Policies + GKE-native security controls |
| **Ingress** | HTTPS only; TLS termination at load balancer |
| **Egress** | Cloud NAT for outbound; restrict to necessary destinations |

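
The Network Policies layer in the table above can be sketched as a default-deny policy plus narrow allow rules. The Python below builds the manifests as plain dictionaries and prints them as JSON, which `kubectl apply -f` accepts. The namespace `prod`, the label `app: backend`, port `8000`, and the source CIDRs are assumptions for illustration. The two CIDRs are the ranges Google documents for load balancer proxies and health checks at the time of writing, and they should be verified against current GCP documentation.

```python
import json


def default_deny(namespace: str) -> dict:
    """NetworkPolicy that blocks all ingress and egress in a namespace."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "default-deny-all", "namespace": namespace},
        "spec": {
            "podSelector": {},  # an empty selector matches every pod
            "policyTypes": ["Ingress", "Egress"],
        },
    }


def allow_ingress_from_cidrs(namespace: str, app: str, cidrs: list, port: int) -> dict:
    """NetworkPolicy that lets the given CIDRs reach pods labelled app=<app>."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": f"allow-lb-to-{app}", "namespace": namespace},
        "spec": {
            "podSelector": {"matchLabels": {"app": app}},
            "policyTypes": ["Ingress"],
            "ingress": [
                {
                    "from": [{"ipBlock": {"cidr": cidr}} for cidr in cidrs],
                    "ports": [{"protocol": "TCP", "port": port}],
                }
            ],
        },
    }


# Assumed source ranges for Google load balancers and health checks; verify before use.
LB_RANGES = ["130.211.0.0/22", "35.191.0.0/16"]

policies = [
    default_deny("prod"),
    allow_ingress_from_cidrs("prod", "backend", LB_RANGES, 8000),
]

if __name__ == "__main__":
    print(json.dumps(policies, indent=2))
```

Because the default-deny policy also blocks egress, a real rollout would need further allow rules (for example DNS and the database endpoint), which are omitted here for brevity.
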
### 3.3 Network Topology (High-Level)

```
Internet
    |
    v
[Cloud Load Balancer] (HTTPS)
    |
    v
[GKE Ingress Controller]
    |
    v
[VPC Private Subnets]
    |
    +-- [GKE Cluster - API Pods]
    +-- [GKE Cluster - Frontend Pods]
    |
    v
[Private connectivity to MongoDB]
```

---

## 4. Compute Platform: GKE

### 4.1 Cluster Strategy

- **GKE Autopilot** for production and staging to minimise node management
- **Single regional cluster** per environment initially; consider multi-region as scale demands
- **Private cluster** with no public endpoint; access via IAP or Bastion if needed

### 4.2 Node Configuration

| Setting | Initial | Growth Phase |
|---------|---------|--------------|
| **Node type** | Autopilot (no manual sizing) | Same |
| **Min nodes** | 0 (scale to zero when idle) | 2 |
| **Max nodes** | 5 | 50+ |
| **Scaling** | Pod-based (HPA); Autopilot provisions nodes automatically | Same |

Autopilot provisions and scales nodes automatically and bills by pod resource requests, so the node counts above are planning targets rather than node-pool settings.

### 4.3 Workload Layout

- **Backend (Python/Flask):** Deployment with HPA (CPU/memory); target 2–3 replicas initially
- **Frontend (React):** Static assets served via CDN or container; 1–2 replicas
- **Ingress:** GKE Ingress for HTTP(S) routing; consider GKE Gateway API for advanced use

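
To make the HPA behaviour for the backend concrete, the sketch below reproduces the core scaling rule from the Kubernetes documentation: `desiredReplicas = ceil(currentReplicas * currentMetric / targetMetric)`, skipped when the ratio is within a tolerance of 1.0. The bounds of 2 and 10 replicas and the 60% CPU target are assumptions for illustration. The real controller also applies stabilisation windows and readiness handling, which this simplification leaves out.

```python
import math


def desired_replicas(
    current_replicas: int,
    current_utilization: float,
    target_utilization: float,
    min_replicas: int = 2,    # assumed lower bound
    max_replicas: int = 10,   # assumed upper bound
    tolerance: float = 0.1,   # Kubernetes default tolerance
) -> int:
    """Simplified HPA rule: scale proportionally to metric / target."""
    ratio = current_utilization / target_utilization
    # Within the tolerance band the controller leaves the replica count alone.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    desired = math.ceil(current_replicas * ratio)
    # Clamp the result to the configured replica bounds.
    return max(min_replicas, min(max_replicas, desired))


if __name__ == "__main__":
    # Two replicas at 90% CPU against a 60% target.
    print(desired_replicas(2, 90, 60))
```

For example, two replicas running at 90% CPU against a 60% target give a ratio of 1.5, so the rule asks for three replicas, which matches the initial 2–3 replica range stated above.
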
### 4.4 Containerisation and CI/CD

| Aspect | Approach |
|-------|----------|
| **Image build** | Dockerfile per service; multi-stage builds; non-root user |
| **Registry** | Artifact Registry (the successor to Container Registry) in `company-inc-shared` |
| **CI** | GitHub Actions (or GitLab CI): build, test, security scan |
| **CD** | ArgoCD or Flux for GitOps; app-of-apps pattern |
| **Secrets** | External Secrets Operator + GCP Secret Manager |

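
As an illustration of the app-of-apps pattern named in the table above, the sketch below generates Argo CD `Application` manifests as Python dictionaries: one root application pointing at a directory of child application manifests, plus one child per service. Argo CD is used here only as an example (the design leaves the choice between ArgoCD and Flux open). The repository URL, directory paths, and namespaces are hypothetical.

```python
import json

# Hypothetical GitOps repository; not part of the original design.
REPO_URL = "https://github.com/company-inc/platform-gitops.git"


def argocd_application(name: str, path: str, namespace: str, revision: str = "main") -> dict:
    """Build an Argo CD Application that syncs one directory of the repo."""
    return {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "Application",
        "metadata": {"name": name, "namespace": "argocd"},
        "spec": {
            "project": "default",
            "source": {"repoURL": REPO_URL, "targetRevision": revision, "path": path},
            "destination": {
                "server": "https://kubernetes.default.svc",  # the local cluster
                "namespace": namespace,
            },
            # Automated sync removes deleted resources and reverts manual drift.
            "syncPolicy": {"automated": {"prune": True, "selfHeal": True}},
        },
    }


# Root app: its directory holds the child Application manifests.
root_app = argocd_application("root", "apps", "argocd")

# Child apps: one per service (names and paths are assumed).
child_apps = [
    argocd_application("backend", "services/backend", "prod"),
    argocd_application("frontend", "services/frontend", "prod"),
]

if __name__ == "__main__":
    print(json.dumps([root_app, *child_apps], indent=2))
```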
---
## 5. Database: MongoDB

### 5.1 Service Choice

**MongoDB Atlas** (or **Firestore with MongoDB compatibility** if a strictly GCP-native service is required) is recommended for:
- Fully managed, automated backups
- Multi-region replication
- Strong security (encryption at rest, VPC peering)
- Easy scaling

**Atlas on GCP** provides native VPC peering and private connectivity.

### 5.2 High Availability and DR

| Topic | Strategy |
|-------|----------|
| **Replicas** | 3-node replica set; multi-zone |
| **Backups** | Continuous backup; point-in-time recovery |
| **Disaster recovery** | Cross-region replica (e.g. `us-central1` + `europe-west1`) |
| **Restore testing** | Quarterly DR drills |

### 5.3 Security

- Private endpoint (no public IP)
- TLS for all connections
- IAM-based access; principle of least privilege
- Encryption at rest (default in Atlas)

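
On the application side, the TLS and least-privilege points above come down to how the backend builds its connection string. The sketch below assembles a `mongodb+srv://` URI using only the standard library, reading credentials from environment variables that would be populated from Secret Manager (for example through External Secrets Operator). The variable names `MONGO_HOST`, `MONGO_DB`, `MONGO_USER`, and `MONGO_PASSWORD` are hypothetical, as is the example host.

```python
import os
from urllib.parse import quote_plus, urlencode


def build_mongo_uri(host: str, database: str, username: str, password: str) -> str:
    """Build a TLS-enabled MongoDB SRV connection string."""
    # Credentials must be percent-encoded so characters like '@' or '/' are safe.
    user = quote_plus(username)
    secret = quote_plus(password)
    options = urlencode({"tls": "true", "retryWrites": "true", "w": "majority"})
    return f"mongodb+srv://{user}:{secret}@{host}/{database}?{options}"


def uri_from_env(env=os.environ) -> str:
    """Read connection settings from the environment (names are assumed)."""
    return build_mongo_uri(
        host=env["MONGO_HOST"],
        database=env["MONGO_DB"],
        username=env["MONGO_USER"],
        password=env["MONGO_PASSWORD"],
    )


if __name__ == "__main__":
    example_env = {
        "MONGO_HOST": "cluster0.example.mongodb.net",  # placeholder host
        "MONGO_DB": "app",
        "MONGO_USER": "svc-api",
        "MONGO_PASSWORD": "change-me",
    }
    print(uri_from_env(example_env))
```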
---
## 6. High-Level Architecture Diagram

The following diagram illustrates the main components (implement in draw.io or Lucidchart):

```
+------------------------------------------------------------------+
|                   COMPANY INC. INFRASTRUCTURE                    |
+------------------------------------------------------------------+

                              [Users]
                                 |
                                 v
+-------------------+                          +-------------------+
|  Cloud CDN        |                          |  Cloud LB (HTTPS) |
|  (Static Assets)  |                          |  (API + SPA)      |
+-------------------+                          +-------------------+
          |                                              |
          v                                              v
+------------------------------------------------------------------+
|                      GKE CLUSTER (Private)                       |
|  +------------------+  +------------------+  +-----------------+ |
|  | Ingress          |  | Backend (Flask)  |  | Frontend (SPA)  | |
|  | Controller       |  | - HPA            |  | - Nginx/React   | |
|  +------------------+  +------------------+  +-----------------+ |
|           |                     |                     |          |
|           +---------------------+---------------------+          |
|                                 |                                |
|  +------------------+  +----------------------+                  |
|  | Redis (cache)    |  | Observability        |                  |
|  | (Memorystore)    |  | (Prometheus/Grafana) |                  |
|  +------------------+  +----------------------+                  |
+------------------------------------------------------------------+
                                 |
                                 v
+------------------------------------------------------------------+
|  MongoDB Atlas (GCP)  |  Secret Manager   |  Artifact Registry   |
|  - Replica Set        |  - App secrets    |  - Container images  |
|  - Private endpoint   |  - DB credentials |                      |
+------------------------------------------------------------------+
```

---

## 7. Summary of Recommendations

| Area | Recommendation |
|------|----------------|
| **Cloud** | GCP with 4 projects (prod, staging, shared, sandbox) |
| **Compute** | GKE Autopilot, private nodes, HPA |
| **Database** | MongoDB Atlas on GCP with multi-zone replicas, automated backups |
| **CI/CD** | GitHub Actions + ArgoCD/Flux |
| **Security** | Private VPC, TLS everywhere, Secret Manager, least privilege |
| **Cost** | Start small; use committed use discounts as usage grows |

---

*This document should be accompanied by an HLD diagram (draw.io or Lucidchart) reflecting the architecture above.*