8 Commits

Author SHA1 Message Date
a44aef5381 Simplify docs exclusion: use paths-ignore on push trigger
Helm Chart CI & Release / Lint Helm Chart (push) Successful in 10s
Helm Chart CI & Release / Semantic Release (push) Successful in 10s
Replace in-job file check with paths-ignore filter.
Workflow won't trigger at all for docs-only changes.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-02-19 20:58:48 +00:00
4a278b1419 Fix CI checkout: use token auth for git clone
Helm Chart CI & Release / Lint Helm Chart (push) Successful in 9s
Helm Chart CI & Release / Semantic Release (push) Successful in 10s
Repo requires authentication; use gitea.token in clone URLs.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-02-19 20:56:50 +00:00
698c977511 Skip release for docs-only changes
Helm Chart CI & Release / Lint Helm Chart (push) Successful in 10s
Helm Chart CI & Release / Semantic Release (push) Successful in 10s
Semantic release now checks changed files and skips tag/publish
when only docs, README, STATUS, AGENTS, or .gitignore are modified.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-02-19 20:54:42 +00:00
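The in-job check this commit describes could look roughly like the sketch below. It is not the code from the commit: `is_docs_only` is a hypothetical helper, and the pattern list is taken only from the files named in the commit message.

```shell
# is_docs_only reads changed file paths on stdin (one per line) and
# succeeds only when every path matches a docs-style pattern.
# Runs in a subshell, so `exit` only ends the function.
is_docs_only() (
  while IFS= read -r path; do
    case "$path" in
      docs/*|README.md|STATUS.md|AGENTS.md|.gitignore) ;;  # ignorable file
      *) exit 1 ;;                                          # a real change
    esac
  done
  exit 0
)
```

A release step could then be guarded with something like `git diff --name-only HEAD~1 HEAD | is_docs_only && echo "skip release"`, assuming the parent commit is present in the clone. Note that an empty file list also counts as docs-only in this sketch.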
86108f5b75 Minor docs change
Helm Chart CI & Release / Lint Helm Chart (push) Successful in 9s
Helm Chart CI & Release / Semantic Release (push) Successful in 9s
2026-02-19 20:40:22 +00:00
fb92b4c000 Minor docs change 2026-02-19 20:35:53 +00:00
ce0851dc3c Add 'what would be overkill' section to architecture doc
Helm Chart CI & Release / Lint Helm Chart (push) Successful in 10s
Helm Chart CI & Release / Semantic Release (push) Successful in 10s
Pragmatic analysis of components that add cost/complexity without
value at startup scale, with guidance on when to introduce each.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-02-19 20:33:18 +00:00
edc552413e Architecture: cost optimisation, blue-green deployment, reduce to 3 projects
Helm Chart CI & Release / Lint Helm Chart (push) Failing after 1s
Helm Chart CI & Release / Semantic Release (push) Has been skipped
- Reduce from 4 to 3 GCP projects (drop sandbox, use staging namespaces)
- Add blue-green deployment strategy via Argo Rollouts
- Add cost optimisation section with monthly estimate (~$175-245)
- Add blue-green flow diagram and cost pie chart to HLD

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-02-19 20:32:30 +00:00
25d4610903 Fix release: use Gitea API directly instead of gitea-release-action
Helm Chart CI & Release / Lint Helm Chart (push) Successful in 9s
Helm Chart CI & Release / Semantic Release (push) Successful in 10s
The action requires Node 18+ (Headers API) but runner uses Node 16.
Use curl against Gitea API for release creation and asset upload.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-02-19 20:02:43 +00:00
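The curl-based replacement has to pull the release ID out of the API response without a JSON tool. The sketch below isolates that extraction so its behaviour is easy to see. `first_id` is a hypothetical wrapper name, and the sample response is made up for illustration rather than captured from Gitea.

```shell
# first_id prints the first numeric "id" field found in JSON read from stdin.
# It mirrors the grep/head/cut pipeline used in the workflow; it is not a JSON parser.
first_id() {
  grep -o '"id":[0-9]*' | head -1 | cut -d: -f2
}

# Hypothetical response body, trimmed for illustration.
sample='{"id":42,"tag_name":"v1.0.0","author":{"id":7}}'
printf '%s\n' "$sample" | first_id   # prints 42
```

The approach relies on the release's own `id` appearing before any nested `id` and on compact JSON with no space after the colon. Where `jq` is available on the runner, `jq -r .id` would be a sturdier choice.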
3 changed files with 162 additions and 45 deletions
+31 -16
@@ -1,5 +1,5 @@
# FleetDM Stack - Gitea Actions # FleetDM Stack - Gitea Actions
# CI: lint on every push # CI: lint on every push (skips docs-only changes)
# Semantic Release: auto-bump version on push to main/master # Semantic Release: auto-bump version on push to main/master
# - merge from feature/* branch → major bump # - merge from feature/* branch → major bump
# - any other commit (fix, chore, etc.) → patch bump # - any other commit (fix, chore, etc.) → patch bump
@@ -12,6 +12,14 @@ on:
branches: branches:
- main - main
- master - master
paths-ignore:
- 'docs/**'
- 'README.md'
- 'STATUS.md'
- 'AGENTS.md'
- 'TASKS.md'
- '.gitignore'
- 'djinni-*/**'
pull_request: pull_request:
branches: branches:
- main - main
@@ -24,7 +32,7 @@ jobs:
steps: steps:
- name: Checkout - name: Checkout
run: | run: |
git clone --depth=1 https://git.produktor.io/${{ gitea.repository }}.git . git clone --depth=1 https://${{ gitea.actor }}:${{ gitea.token }}@git.produktor.io/${{ gitea.repository }}.git .
git checkout ${{ gitea.sha }} git checkout ${{ gitea.sha }}
- name: Install Helm - name: Install Helm
@@ -48,7 +56,7 @@ jobs:
steps: steps:
- name: Checkout (full history for tags) - name: Checkout (full history for tags)
run: | run: |
git clone https://git.produktor.io/${{ gitea.repository }}.git . git clone https://${{ gitea.actor }}:${{ gitea.token }}@git.produktor.io/${{ gitea.repository }}.git .
git fetch --tags git fetch --tags
- name: Determine version bump - name: Determine version bump
@@ -60,13 +68,11 @@ jobs:
fi fi
echo "Latest tag: $LATEST_TAG" echo "Latest tag: $LATEST_TAG"
# Strip 'v' prefix and split
VER="${LATEST_TAG#v}" VER="${LATEST_TAG#v}"
MAJOR=$(echo "$VER" | cut -d. -f1) MAJOR=$(echo "$VER" | cut -d. -f1)
MINOR=$(echo "$VER" | cut -d. -f2) MINOR=$(echo "$VER" | cut -d. -f2)
PATCH=$(echo "$VER" | cut -d. -f3) PATCH=$(echo "$VER" | cut -d. -f3)
# Check if this commit is a merge from a feature/* branch
COMMIT_MSG=$(git log -1 --format='%s' ${{ gitea.sha }}) COMMIT_MSG=$(git log -1 --format='%s' ${{ gitea.sha }})
echo "Commit message: $COMMIT_MSG" echo "Commit message: $COMMIT_MSG"
@@ -74,7 +80,6 @@ jobs:
if echo "$COMMIT_MSG" | grep -qiE "^Merge.*feature/"; then if echo "$COMMIT_MSG" | grep -qiE "^Merge.*feature/"; then
IS_FEATURE="true" IS_FEATURE="true"
fi fi
# Also check parent branches for merge commits
if git log -1 --format='%P' ${{ gitea.sha }} | grep -q ' '; then if git log -1 --format='%P' ${{ gitea.sha }} | grep -q ' '; then
MERGE_BRANCH=$(git log -1 --format='%s' ${{ gitea.sha }} | grep -oE "feature/[^ '\"]*" || true) MERGE_BRANCH=$(git log -1 --format='%s' ${{ gitea.sha }} | grep -oE "feature/[^ '\"]*" || true)
if [ -n "$MERGE_BRANCH" ]; then if [ -n "$MERGE_BRANCH" ]; then
@@ -124,14 +129,24 @@ jobs:
git push https://${{ gitea.actor }}:${{ gitea.token }}@git.produktor.io/${{ gitea.repository }}.git "${{ steps.version.outputs.new_tag }}" git push https://${{ gitea.actor }}:${{ gitea.token }}@git.produktor.io/${{ gitea.repository }}.git "${{ steps.version.outputs.new_tag }}"
- name: Create Gitea Release - name: Create Gitea Release
uses: https://gitea.com/actions/gitea-release-action@v1 run: |
with: TAG="${{ steps.version.outputs.new_tag }}"
server_url: ${{ gitea.server_url }} BUMP="${{ steps.version.outputs.bump_type }}"
token: ${{ gitea.token }} API="https://git.produktor.io/api/v1/repos/${{ gitea.repository }}/releases"
tag_name: ${{ steps.version.outputs.new_tag }} TOKEN="${{ gitea.token }}"
name: FleetDM Stack ${{ steps.version.outputs.new_tag }}
body: |
**${{ steps.version.outputs.bump_type }}** release — `${{ steps.version.outputs.new_tag }}`
Helm chart for FleetDM Server with MySQL and Redis. RELEASE=$(curl -sf -X POST "$API" \
files: .tmp/*.tgz -H "Authorization: token $TOKEN" \
-H "Content-Type: application/json" \
-d "{\"tag_name\":\"$TAG\",\"name\":\"FleetDM Stack $TAG\",\"body\":\"**${BUMP}** release — \`${TAG}\`\n\nHelm chart for FleetDM Server with MySQL and Redis.\"}")
RELEASE_ID=$(echo "$RELEASE" | grep -o '"id":[0-9]*' | head -1 | cut -d: -f2)
echo "Created release ID: $RELEASE_ID"
for f in .tmp/*.tgz; do
FNAME=$(basename "$f")
curl -sf -X POST "$API/$RELEASE_ID/assets?name=$FNAME" \
-H "Authorization: token $TOKEN" \
-H "Content-Type: application/octet-stream" \
--data-binary "@$f"
echo "Uploaded: $FNAME"
done
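The bump rule stated in the workflow header (a merge from `feature/*` gives a major bump, anything else a patch bump) can be sketched as a standalone function. `next_tag` is a hypothetical helper, not a name from the workflow, and resetting minor and patch to 0 on a major bump is an assumption, since that part of the step is not shown in the diff.

```shell
# next_tag LATEST_TAG COMMIT_SUBJECT
# Prints the next tag according to the rule in the workflow header.
# Runs in a subshell so its variables do not leak into the caller.
next_tag() (
  ver="${1#v}"                 # strip the leading 'v'
  major="${ver%%.*}"           # text before the first dot
  rest="${ver#*.}"
  minor="${rest%%.*}"
  patch="${rest#*.}"
  if printf '%s\n' "$2" | grep -qiE '^Merge.*feature/'; then
    # Assumption: a major bump resets minor and patch to 0.
    major=$((major + 1)); minor=0; patch=0
  else
    patch=$((patch + 1))
  fi
  printf 'v%s.%s.%s\n' "$major" "$minor" "$patch"
)
```

For example, `next_tag v1.2.3 "Merge branch 'feature/login' into main"` prints `v2.0.0`, while `next_tag v1.2.3 "Fix CI checkout"` prints `v1.2.4`.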
+87 -19
@@ -10,7 +10,7 @@
This document outlines a robust, scalable, secure, and cost-effective infrastructure design for Company Inc., a startup deploying a web application with a Python/Flask REST API backend, React SPA frontend, and MongoDB database. The design leverages **Google Cloud Platform (GCP)** with **GKE (Google Kubernetes Engine)** as the primary compute platform. This document outlines a robust, scalable, secure, and cost-effective infrastructure design for Company Inc., a startup deploying a web application with a Python/Flask REST API backend, React SPA frontend, and MongoDB database. The design leverages **Google Cloud Platform (GCP)** with **GKE (Google Kubernetes Engine)** as the primary compute platform.
**Key Design Principles:** Security-by-default, scalability from day one, cost optimization for early stage, and GitOps-based operations. **Key Design Principles:** Cost awareness from day one, security-by-default, scalability when needed, and GitOps-based operations.
--- ---
@@ -20,20 +20,26 @@ This document outlines a robust, scalable, secure, and cost-effective infrastruc
**Rationale:** GCP offers strong managed Kubernetes (GKE) with autopilot options, excellent MongoDB Atlas integration (or GCP-native document-database alternatives such as Firestore), competitive pricing for startups, and simplified networking. GKE Autopilot reduces operational overhead for a small team with limited Kubernetes expertise. **Rationale:** GCP offers strong managed Kubernetes (GKE) with autopilot options, excellent MongoDB Atlas integration (or GCP-native document-database alternatives such as Firestore), competitive pricing for startups, and simplified networking. GKE Autopilot reduces operational overhead for a small team with limited Kubernetes expertise.
### 2.2 Multi-Project Structure ### 2.2 Project Structure (Cost-Optimised)
For a startup, fewer projects mean lower overhead and simpler billing. Start with **3 projects** and add more only when traffic or compliance demands it.
| Project | Purpose | Isolation | | Project | Purpose | Isolation |
|---------|---------|-----------| |---------|---------|-----------|
| **company-inc-prod** | Production workloads | High; sensitive data | | **company-inc-prod** | Production workloads | High; sensitive data |
| **company-inc-staging** | Staging / pre-production | Medium | | **company-inc-staging** | Staging, QA, and dev experimentation | Medium |
| **company-inc-shared** | CI/CD, shared tooling, DNS | Low; no PII | | **company-inc-shared** | CI/CD, Artifact Registry, DNS | Low; no PII |
| **company-inc-sandbox** | Dev experimentation | Lowest |
**Why not 4+ projects?**
- A dedicated sandbox project adds billing, IAM, and networking overhead with little benefit at startup scale.
- Developers can use Kubernetes namespaces within the staging cluster for experimentation.
- A fourth project can be introduced later when team size or compliance (SOC2, HIPAA) requires it.
**Benefits:** **Benefits:**
- Billing separation per environment - Billing separation (prod costs are clearly visible)
- Blast-radius containment (prod issues do not affect staging) - Blast-radius containment (prod issues do not affect staging)
- IAM and network isolation - IAM isolation between environments
- Aligns with GCP best practices for multi-tenant or multi-env setups - Minimal fixed cost — only 3 projects to manage
--- ---
@@ -96,14 +102,37 @@ flowchart TD
- **Frontend (React):** Static assets served via CDN or container; 1-2 replicas - **Frontend (React):** Static assets served via CDN or container; 1-2 replicas
- **Ingress:** GKE Ingress for HTTP(S) routing; consider GKE Gateway API for advanced use - **Ingress:** GKE Ingress for HTTP(S) routing; consider GKE Gateway API for advanced use
### 4.4 Containerisation and CI/CD ### 4.4 Blue-Green Deployment
Zero-downtime releases without duplicating infrastructure. Both versions run inside the **same GKE cluster**; the load balancer switches traffic atomically.
```mermaid
flowchart LR
LB[Load Balancer]
LB -->|100% traffic| Green[Green — v1.2.0<br/>current stable]
LB -.->|0% traffic| Blue[Blue — v1.3.0<br/>new release]
Blue -.->|smoke tests pass| LB
```
---
| Phase | Action |
|-------|--------|
| **Deploy** | New version deployed to the idle slot (blue) |
| **Test** | Run smoke tests / synthetic checks against blue |
| **Switch** | Update Service selector or Ingress to point to blue |
| **Rollback** | Instant — revert selector back to green (old version still running) |
| **Cleanup** | Scale down old slot after confirmation period |
**Cost impact:** Near-zero — both slots share the same node pool; the idle slot consumes minimal resources until traffic is switched. Argo Rollouts automates the full lifecycle within ArgoCD.
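The Switch and Rollback phases in the table above come down to changing which slot a Service selects. The sketch below shows that by hand, purely for illustration; in the design described here Argo Rollouts would perform the switch. The label key `slot`, the values `blue` and `green`, and the Service name `api` are all assumptions.

```shell
# selector_patch prints a JSON merge patch that points a Service at one slot.
# The label key "slot" is an illustrative assumption.
selector_patch() {
  case "$1" in
    blue|green) printf '{"spec":{"selector":{"slot":"%s"}}}\n' "$1" ;;
    *) echo "usage: selector_patch blue|green" >&2; return 1 ;;
  esac
}

# switch_traffic applies the patch with kubectl.
# "api" is a hypothetical Service name; a configured cluster context is required.
switch_traffic() {
  kubectl patch service api --type merge -p "$(selector_patch "$1")"
}
```

Under these assumptions, `switch_traffic blue` would cut traffic over to the new release and `switch_traffic green` would roll it back.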
### 4.5 Containerisation and CI/CD
| Aspect | Approach | | Aspect | Approach |
|-------|----------| |-------|----------|
| **Image build** | Dockerfile per service; multi-stage builds; non-root user | | **Image build** | Dockerfile per service; multi-stage builds; non-root user |
| **Registry** | Artifact Registry (GCR) in `company-inc-shared` | | **Registry** | Artifact Registry in `company-inc-shared` |
| **CI** | GitHub Actions (or GitLab CI) — build, test, security scan | | **CI** | GitHub/Gitea Actions — build, test, security scan |
| **CD** | ArgoCD or Flux — GitOps; app of apps pattern | | **CD** | ArgoCD + Argo Rollouts — GitOps with blue-green strategy |
| **Secrets** | External Secrets Operator + GCP Secret Manager | | **Secrets** | External Secrets Operator + GCP Secret Manager |
--- ---
@@ -138,10 +167,48 @@ flowchart TD
--- ---
## 6. High-Level Architecture Diagram ## 6. Cost Optimisation Strategy
| Lever | Approach | Estimated Savings |
|-------|----------|-------------------|
| **3 projects, not 4** | Drop sandbox; use staging namespaces | ~25% fewer fixed project costs |
| **GKE Autopilot** | Pay per pod, not per node; no idle nodes | 30-60% vs standard GKE |
| **Blue-green in-cluster** | No duplicate environments for releases | Near-zero deployment cost |
| **Spot/preemptible pods** | Use for staging and non-critical workloads | Up to 60-80% off compute |
| **Committed use discounts** | 1-year CUDs once baseline is established | 20-30% off sustained use |
| **CDN for frontend** | Offload SPA traffic from GKE | Fewer pod replicas needed |
| **MongoDB Atlas auto-scale** | Start M10; scale up only when needed | Avoid over-provisioning |
| **Cloud NAT shared** | Single NAT in shared project | Avoid per-project NAT cost |
**Monthly cost estimate (early stage):**
- GKE Autopilot (2-3 API pods + 1 SPA): ~$80-150
- MongoDB Atlas M10: ~$60
- Load Balancer + Cloud NAT: ~$30
- Artifact Registry + Secret Manager: ~$5
- **Total: ~$175-245/month**
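As a quick check on the total, the line items can be summed directly. The sketch reads the GKE Autopilot item as a range of $80 to $150 and treats the other three items as fixed; the variable names are illustrative.

```shell
# Monthly line items in USD: GKE Autopilot (low/high), Atlas M10, LB + NAT, Registry + Secrets.
low=$((80 + 60 + 30 + 5))
high=$((150 + 60 + 30 + 5))
echo "Estimated total: \$${low}-\$${high}/month"   # prints: Estimated total: $175-$245/month
```

Only the GKE Autopilot item moves the total, so the width of the range is the pod count and sizing rather than any of the fixed services.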
### 6.1 What Would Be Overkill at This Stage
Not everything in a "best practices" architecture is worth implementing on day one. The following are valuable at scale but add cost and complexity that a startup with a few hundred users/day does not need yet.
| Component | Why it's overkill now | When to introduce |
|-----------|----------------------|-------------------|
| **Multi-region GKE** | Single region handles millions of req/day; multi-region doubles cost | When SLA requires 99.99% or users span continents |
| **Service mesh (Istio/Linkerd)** | Adds sidecar overhead, complexity, and debugging difficulty | When you have 10+ microservices with mTLS requirements |
| **Cross-region MongoDB replica** | Atlas M10 with multi-AZ is sufficient; cross-region adds ~2x DB cost | When RPO < 1 hour is a compliance requirement |
| **Dedicated observability stack** | GKE built-in monitoring + Cloud Logging is free; Prometheus/Grafana adds ops burden | When team has > 2 SREs and needs custom dashboards |
| **4+ GCP projects** | 3 projects cover prod/staging/shared; more adds IAM and billing complexity | When compliance (SOC2, HIPAA) requires strict separation |
| **API Gateway (Apigee, Kong)** | GKE Ingress handles routing; a gateway adds cost and latency | When you need rate limiting, API keys, or monetisation |
| **Vault for secrets** | GCP Secret Manager is cheaper, simpler, and natively integrated | When you need dynamic secrets or multi-cloud secret federation |
**Rule of thumb:** if a component doesn't solve a problem you have *today*, defer it. Every added piece increases the monthly bill and the on-call surface area.
---
## 7. High-Level Architecture Diagram
```mermaid ```mermaid
flowchart TB flowchart TD
Users((Users)) Users((Users))
Users --> CDN[Cloud CDN<br/>Static Assets] Users --> CDN[Cloud CDN<br/>Static Assets]
@@ -164,21 +231,22 @@ flowchart TB
API --> Mongo API --> Mongo
API --> Secrets API --> Secrets
GKE --> Registry GKE ----> Registry
``` ```
--- ---
## 7. Summary of Recommendations ## 8. Summary of Recommendations
| Area | Recommendation | | Area | Recommendation |
|------|----------------| |------|----------------|
| **Cloud** | GCP with 4 projects (prod, staging, shared, sandbox) | | **Cloud** | GCP with 3 projects (prod, staging, shared) |
| **Compute** | GKE Autopilot, private nodes, HPA | | **Compute** | GKE Autopilot, private nodes, HPA |
| **Deployments** | Blue-green via Argo Rollouts — zero downtime, instant rollback |
| **Database** | MongoDB Atlas on GCP with multi-AZ, automated backups | | **Database** | MongoDB Atlas on GCP with multi-AZ, automated backups |
| **CI/CD** | GitHub Actions + ArgoCD/Flux | | **CI/CD** | GitHub/Gitea Actions + ArgoCD |
| **Security** | Private VPC, TLS everywhere, Secret Manager, least privilege | | **Security** | Private VPC, TLS everywhere, Secret Manager, least privilege |
| **Cost** | Start small; use committed use discounts as usage grows | | **Cost** | ~$175-245/month early stage; spot pods, CUDs as traffic grows |
--- ---
+44 -10
@@ -9,22 +9,25 @@ flowchart TB
end end
subgraph GCP["Google Cloud Platform"] subgraph GCP["Google Cloud Platform"]
subgraph Projects["Project Structure"] subgraph Projects["Project Structure (3 projects)"]
Prod[company-inc-prod] Prod[company-inc-prod]
Staging[company-inc-staging] Staging[company-inc-staging<br/>QA + dev namespaces]
Shared[company-inc-shared] Shared[company-inc-shared]
Sandbox[company-inc-sandbox]
end end
subgraph Edge["Edge / Networking"] subgraph Edge["Edge / Networking"]
LB[Cloud Load Balancer<br/>HTTPS · TLS termination] LB[Cloud Load Balancer<br/>HTTPS · TLS termination]
CDN[Cloud CDN<br/>Static Assets] CDN[Cloud CDN<br/>Static Assets]
NAT[Cloud NAT<br/>Egress] NAT[Cloud NAT<br/>Egress · shared]
end end
subgraph VPC["VPC — Private Subnets"] subgraph VPC["VPC — Private Subnets"]
subgraph GKE["GKE Autopilot Cluster"] subgraph GKE["GKE Autopilot Cluster"]
Ingress[Ingress Controller] Ingress[Ingress Controller]
subgraph BlueGreen["Blue-Green Deployment"]
Green[Green — stable<br/>receives traffic]
Blue[Blue — new release<br/>smoke tests]
end
subgraph Workloads subgraph Workloads
API[Backend — Python / Flask<br/>HPA · 2-3 replicas] API[Backend — Python / Flask<br/>HPA · 2-3 replicas]
SPA[Frontend — React SPA<br/>Nginx] SPA[Frontend — React SPA<br/>Nginx]
@@ -44,14 +47,17 @@ flowchart TB
subgraph CICD["CI / CD"] subgraph CICD["CI / CD"]
Git[Git Repository] Git[Git Repository]
Actions[Gitea / GitHub Actions<br/>Build · Test · Scan] Actions[Gitea / GitHub Actions<br/>Build · Test · Scan]
Argo[ArgoCD / Flux<br/>GitOps Deploy] Argo[ArgoCD + Argo Rollouts<br/>GitOps · Blue-Green]
end end
Users --> LB Users --> LB
Users --> CDN Users --> CDN
LB --> Ingress LB --> Ingress
CDN --> SPA CDN --> SPA
Ingress --> API Ingress -->|traffic| Green
Ingress -.->|after switch| Blue
Green --> API
Blue --> API
Ingress --> SPA Ingress --> SPA
API --> Redis API --> Redis
API --> Mongo API --> Mongo
@@ -61,7 +67,25 @@ flowchart TB
Git --> Actions Git --> Actions
Actions --> Registry Actions --> Registry
Argo --> GKE Argo ----> GKE
```
## Blue-Green Deployment Flow
```mermaid
flowchart LR
subgraph Cluster["GKE Cluster"]
LB[Load Balancer<br/>Service Selector]
Green[Green — v1.2.0<br/>current stable]
Blue[Blue — v1.3.0<br/>new release]
end
Deploy[ArgoCD<br/>Argo Rollouts] -->|deploy new version| Blue
Blue -->|smoke tests| Check{Tests pass?}
Check -->|yes| LB
LB -->|switch 100%| Blue
Check -->|no| Rollback[Rollback<br/>keep Green]
LB -.->|instant rollback| Green
``` ```
## CI / CD Pipeline ## CI / CD Pipeline
@@ -72,17 +96,27 @@ flowchart LR
Repo -->|webhook| CI[CI Pipeline<br/>lint · test · build] Repo -->|webhook| CI[CI Pipeline<br/>lint · test · build]
CI -->|push image| Registry[Artifact Registry] CI -->|push image| Registry[Artifact Registry]
CI -->|update manifests| GitOps[GitOps Repo] CI -->|update manifests| GitOps[GitOps Repo]
GitOps -->|sync| Argo[ArgoCD / Flux] GitOps -->|sync| Argo[ArgoCD]
Argo -->|deploy| GKE[GKE Cluster] Argo -->|blue-green deploy| GKE[GKE Cluster]
``` ```
## Network Security Layers ## Network Security Layers
```mermaid ```mermaid
flowchart TD flowchart LR
Internet((Internet)) --> FW[VPC Firewall<br/>Default deny] Internet((Internet)) --> FW[VPC Firewall<br/>Default deny]
FW --> LB[Load Balancer<br/>HTTPS only] FW --> LB[Load Balancer<br/>HTTPS only]
LB --> NP[K8s Network Policies] LB --> NP[K8s Network Policies]
NP --> Pods[Application Pods<br/>Private IPs only] NP --> Pods[Application Pods<br/>Private IPs only]
Pods --> PE[Private Endpoint<br/>MongoDB Atlas] Pods --> PE[Private Endpoint<br/>MongoDB Atlas]
``` ```
## Cost Profile (Early Stage)
```mermaid
pie title Monthly Cost Breakdown (~$200)
"GKE Autopilot" : 120
"MongoDB Atlas M10" : 60
"LB + NAT" : 30
"Registry + Secrets" : 5
```