Real-World Scenarios

End-to-end architecture designs: here's the problem, and here's how you'd solve it.


1. Startup Office Network

🏢 Ten people, one office

The Problem

A small startup needs a reliable office network for about ten people: fast WiFi and wired desks, a separate guest network, basic security, and a way for remote workers to reach internal resources without exposing everything to the internet.

Internet → ISP Router / Modem → Firewall (NAT, policies, VPN endpoint) → L3 Switch (SVIs, inter-VLAN routing)
L3 Switch → VLAN 10 (Workstations) · VLAN 20 (Servers / NAS) · VLAN 30 (Guest WiFi)

Remote workers → VPN client → Firewall VPN → same L3 switch path to VLAN 10/20

Key design decisions

Decision | Why it matters
VLANs | Isolate staff, servers, and guests so broadcast domains and blast radius stay small.
L3 switch | Route between VLANs on-device instead of hair-pinning through a separate router.
VPN | Encrypt remote access; avoid exposing management or file shares directly.
WiFi | WPA3-Enterprise or WPA3-Personal on staff SSID; captive portal optional on guest.
Rough cost | $2–5K (firewall + managed L3 switch + APs + cabling)

Key takeaways

  • Segment early with VLANs; guest traffic should never trust internal subnets.
  • Put VPN termination on the firewall and keep rules explicit (who can reach which VLAN).
  • Document IP plans and DHCP scopes per VLAN before you plug in the tenth laptop.
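The IP plan in that last bullet can be scripted before anything is plugged in. A minimal sketch using Python's standard `ipaddress` module, assuming one /24 per VLAN with the gateway at .1 and a DHCP pool from .100 to .199 (the CIDRs and pool boundaries here are illustrative choices, not requirements):

```python
import ipaddress

# Hypothetical per-VLAN IP plan: one /24 per VLAN, gateway at .1,
# DHCP pool .100–.199, the rest reserved for static assignments.
VLANS = {
    10: ("Workstations", "192.168.10.0/24"),
    20: ("Servers / NAS", "192.168.20.0/24"),
    30: ("Guest WiFi",   "192.168.30.0/24"),
}

def vlan_plan(vlan_id):
    name, cidr = VLANS[vlan_id]
    net = ipaddress.ip_network(cidr)
    first_host = next(net.hosts())           # .1 in a /24
    return {
        "vlan": vlan_id,
        "name": name,
        "gateway": str(first_host),
        "dhcp_start": str(net.network_address + 100),
        "dhcp_end": str(net.network_address + 199),
    }

for vid in VLANS:
    p = vlan_plan(vid)
    print(f"VLAN {p['vlan']:>2} {p['name']:<14} gw {p['gateway']}  "
          f"DHCP {p['dhcp_start']}–{p['dhcp_end']}")
```

Keeping the plan in a small script (or even a YAML file) gives you something to diff and review when VLANs are added later.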
2. 3-Tier Web App on GCP

☁️ Production web app on Google Cloud

The Problem

You need to run a production web application with a database, HTTPS everywhere, autoscaling frontends, protected APIs, and a CDN, without managing physical load balancers or TLS certificates on each VM.

Users → Cloud CDN → Cloud Armor (WAF / L7 rules) → External HTTPS LB
External HTTPS LB → Frontend (GCE MIG, us-central1) → Internal LB → Backend API (GKE) → Cloud SQL (private IP · us-central1 · MySQL/PostgreSQL)

VPC design

VPC: custom mode, us-central1

Subnet | CIDR | Purpose
web-subnet | 10.0.1.0/24 | Frontend MIG NICs
api-subnet | 10.0.2.0/24 | GKE nodes / internal LB
db-subnet | 10.0.3.0/24 | Private IP for Cloud SQL
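A subnet plan like this is worth sanity-checking for overlaps before it is applied, since overlapping ranges in one VPC are rejected or, worse, mask routing mistakes across peered networks. A quick stdlib-only check, assuming the three CIDRs above:

```python
import ipaddress
from itertools import combinations

# Subnets from the VPC design above (region us-central1).
subnets = {
    "web-subnet": ipaddress.ip_network("10.0.1.0/24"),
    "api-subnet": ipaddress.ip_network("10.0.2.0/24"),
    "db-subnet":  ipaddress.ip_network("10.0.3.0/24"),
}

def overlapping_pairs(nets):
    """Return every pair of named subnets whose address ranges overlap."""
    return [(a, b) for (a, na), (b, nb) in combinations(nets.items(), 2)
            if na.overlaps(nb)]

# A clean plan has no overlapping pairs.
assert overlapping_pairs(subnets) == []
```

The same check scales to dozens of subnets across peered VPCs, where overlaps are much easier to introduce by accident.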

Firewall rules (conceptual)

Rule | Allow
Ingress to LB | TCP 80/443 from 0.0.0.0/0 (or restricted ranges) to tagged frontend targets
Web → API | From frontend service account / subnet to internal LB or pod IPs on app ports
API → DB | TCP 3306 (or 5432) from GKE nodes or serverless connector to Cloud SQL private IP only

Key GCP services

Service | Role
Cloud CDN | Cache static assets at the edge; reduce origin load
Cloud Armor | WAF, rate limits, geo/IP allow lists in front of the external LB
External HTTPS LB | Managed certs, global anycast front end, health-checked backends
Compute Engine MIG | Autoscaling stateless web tier
Internal LB + GKE | Private API tier; no public IPs on workloads
Cloud SQL | Managed DB with private IP and automated backups

Key takeaways

  • Keep the data plane private: SQL has no public IP; APIs sit behind an internal LB.
  • Layer defense: CDN → Armor → LB → autoscaling groups, each with a clear responsibility.
  • Align subnets with trust zones and enforce least privilege in VPC firewall rules or firewall policies.
3. Hybrid Cloud: On-Prem + GCP

🔗 Enterprise bridge to the cloud

The Problem

An enterprise with an on-premises data center needs secure, predictable connectivity into GCP so teams can use Shared VPC, private Google APIs, and lift-and-shift or hybrid workloads without sending everything over the public internet unencrypted.

On-Prem DC → HA VPN or Dedicated Interconnect → Cloud Router (BGP sessions) → Shared VPC host project
Shared VPC host project → Service projects: Dev · Staging · Prod

DNS: Cloud DNS inbound forwarding + on-prem conditional forwarders (or outbound to corporate resolvers) so names resolve in both directions.

VPN vs interconnect (decision guide)

Option | Bandwidth | Cost / complexity | Latency
HA VPN | Up to a few Gbps aggregate (depends on tunnels) | Lower capex; quick to stand up | Internet path; more jitter
Partner Interconnect | 50 Mbps–50 Gbps | Carrier engagement; recurring port fee | Often better than VPN; not private fiber
Dedicated Interconnect | 10–100 Gbps | Highest commitment; physical POP work | Lowest, most consistent to Google edge
Topic | Guidance
BGP | Advertise on-prem prefixes into GCP and learn VPC subnets; control path preference with MED/AS path as needed.
Overlapping CIDRs | Avoid if possible; otherwise use separate VPCs, NAT, or application-level gateways. Never “double NAT” without a design doc.
Failover | Two tunnels or two interconnect attachments; tune BGP so standby paths activate when primaries drop.
Bandwidth | Plan for backup windows, DR, and burst traffic; interconnect is often cheaper per Gbps at scale.
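The failover behavior described above can be sketched as a tiny best-path function: prefer the shortest AS path, then the lowest MED, which is a simplified slice of BGP best-path selection. The route names and metric values are hypothetical:

```python
# Minimal best-path sketch: shortest AS path wins, lowest MED breaks ties.
# This is a simplified slice of BGP selection, enough to show why a
# standby path takes over when the primary is withdrawn.
def best_path(routes):
    """routes: list of dicts with 'name', 'as_path_len', 'med'."""
    return min(routes, key=lambda r: (r["as_path_len"], r["med"]))

routes = [
    {"name": "interconnect-primary", "as_path_len": 1, "med": 100},
    {"name": "vpn-backup",           "as_path_len": 1, "med": 200},
]
assert best_path(routes)["name"] == "interconnect-primary"

# Primary withdrawn (link down): only the backup remains, and it is chosen.
assert best_path(routes[1:])["name"] == "vpn-backup"
```

Tuning MED (or AS-path prepending on the advertised side) is how you make "standby paths activate when primaries drop" deterministic rather than accidental.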

Key takeaways

  • Shared VPC keeps networking consistent while service projects isolate IAM and billing.
  • Treat hybrid routing as a product: document prefixes, BGP ASN, and what happens when a link fails.
  • DNS and private Google Access are as important as the tunnel: apps break when names or APIs don’t resolve.
4. Multi-Region HA Architecture

🌍 Global uptime and low latency

The Problem

A global-facing application must stay available (think 99.99% SLO), survive regional outages, and serve users from nearby edges, while keeping a clear story for data replication, failover, and how long recovery takes.

Global HTTPS LB (Anycast IP)
→ Region 1 · us-central1: GKE (primary serving) + Cloud SQL primary
→ Region 2 · europe-west1: GKE (warm standby / active) + Cloud SQL read replica / standby
Alternative: Cloud Spanner for global consistency (where the model fits)

Failover: LB health checks detach unhealthy backends; promote replica or use managed failover; Cloud DNS geolocation or health-checked routing steers users.
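That steering logic reduces to "nearest healthy region wins." A sketch of health-checked geo routing, with made-up latency numbers and two regions matching the diagram above:

```python
# Sketch of health-checked geo routing: send each user to the nearest
# region that is passing health checks. Latencies are illustrative.
REGIONS = {
    "us-central1":  {"healthy": True, "rtt_ms": {"US": 20, "EU": 110}},
    "europe-west1": {"healthy": True, "rtt_ms": {"US": 110, "EU": 15}},
}

def route(user_geo, regions=REGIONS):
    healthy = {r: v for r, v in regions.items() if v["healthy"]}
    return min(healthy, key=lambda r: healthy[r]["rtt_ms"][user_geo])

assert route("EU") == "europe-west1"   # nearby region, while healthy

# Simulate a regional outage: EU users fail over to us-central1.
REGIONS["europe-west1"]["healthy"] = False
assert route("EU") == "us-central1"
```

The real systems (LB health checks, Cloud DNS routing policies) implement the same decision continuously; the sketch just makes the failover condition explicit.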

RPO / RTO considerations

Topic | What to decide
RPO | How much data loss is acceptable; synchronous replication vs async replica lag
RTO | Time to redirect traffic, promote the DB, or shift to standby GKE; practice with game days
Health checks | Align probe paths with real user journeys (not just TCP open on port 443)
DNS TTL | Lower TTL before changes; balance with resolver caching and cost of churn
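RTO targets get concrete once the SLO's error budget is expressed in minutes. The arithmetic is simple enough to keep in a one-liner:

```python
# A 99.99% availability SLO leaves a small error budget; converting it
# to minutes per window makes RTO targets concrete.
def downtime_budget_minutes(slo, days=30):
    return (1 - slo) * days * 24 * 60

monthly = downtime_budget_minutes(0.9999)
print(f"99.99% over 30 days ≈ {monthly:.1f} min of downtime")  # ≈ 4.3 min
```

At 4.3 minutes a month, a failover that needs a human to promote the database by hand has already blown the budget, which is the argument for managed failover and rehearsed game days.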

Cross-region networking

Topic | Note
VPC | Global resource; subnets are regional. Attach resources per region to local subnets.
Private backbone | East-west traffic between regions stays on Google’s network when using private IPs and appropriate routing.

Key takeaways

  • Multi-region is a data problem first: pick Spanner, global DB patterns, or explicit async replication with known lag.
  • Run regional GKE clusters and use a global LB to front them; avoid single-region single points of failure.
  • Document RPO/RTO and test failover quarterly; dashboards alone don’t fail over the database.
5. Zero Trust Network on GCP

🔐 No implicit trust by network location

The Problem

Traditional “castle and moat” designs trust anything inside the corporate network. Zero trust assumes breach: every access request is authenticated, authorized, encrypted, and logged, regardless of whether the user is on VPN or in a coffee shop.

Before / after

Before

  • Public IPs on VMs
  • VPN for “trusted” access
  • Perimeter firewall = main control
  • “Inside the VPC” = trusted

After

  • No public IPs on workloads
  • IAP for human access to UIs/SSH
  • VPC SC + IAM at every hop
  • Verify explicitly, always

Architecture (BeyondCorp-style on GCP)

BeyondCorp mindset: identity + device + context
IAP (Identity-Aware Proxy) → Backend services (private IPs only), inside VPC Service Controls service perimeters
Plus: Private Google Access · OS Login · Binary Authorization

Key components

Component | Purpose
IAP | OAuth 2.0 gate in front of HTTPS resources; TCP tunneling for SSH/RDP without bastion IPs
VPC Service Controls | Perimeters that constrain data exfiltration from GCP APIs; VPC SC bridges for controlled egress
Private Google Access (PGA) | Reach Google APIs from RFC 1918 addresses without public internet paths
OS Login | IAM-bound Linux user accounts and sudo policies on GCE
Binary Authorization | Only deploy container images signed/attested per policy (supply chain)
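The core zero-trust decision can be sketched as a policy evaluator in the BeyondCorp spirit: every request carries identity, device, and context signals, all checks must pass, and network location never grants access on its own. The check names below are illustrative, not real IAP or IAM APIs:

```python
# Toy zero-trust authorizer: all signals must pass; "network" is carried
# along but deliberately never consulted.
def authorize(request):
    checks = [
        request.get("identity_verified", False),  # e.g. OAuth handled by IAP
        request.get("device_compliant", False),   # patched, managed device
        request.get("role_allows", False),        # IAM binding on the resource
    ]
    return all(checks)

on_vpn_but_unverified = {"identity_verified": False, "device_compliant": True,
                         "role_allows": True, "network": "corp"}
coffee_shop_verified  = {"identity_verified": True, "device_compliant": True,
                         "role_allows": True, "network": "public"}

assert not authorize(on_vpn_but_unverified)  # being "inside" is not enough
assert authorize(coffee_shop_verified)       # verified access from anywhere
```

The contrast between the two sample requests is the whole before/after story above: location drops out of the decision, and identity, device, and role carry it.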

Key takeaways

  • Replace “VPN to the network” with “IAP to the resource” where possible: smaller blast radius.
  • VPC SC complements IAM: it limits what even valid credentials can do at the API edge.
  • Zero trust is continuous: patch devices, refresh tokens, audit logs, and tighten policies as apps change.