Google Cloud Platform: Complete Service Guide

Every GCP service from Compute to AI. Zero fluff, all clarity.

5 Compute
7 Storage & DB
6 Data & Analytics
3 AI/ML
5 Security
4 DevOps
3 Ops
3 Mgmt

☁️ Compute Services

1. Compute Engine

🖥️ Compute Engine, Virtual Machines

IaaS

Fully customizable VMs running on Google's infrastructure. Choose CPU, memory, GPU, disk, and OS.

Family | Series | vCPUs | Memory | Use Case
General | E2, N2, N2D, N1 | 2–224 | 1–896 GB | Web servers, small/mid DBs, dev/test
Compute | C2, C2D, C3 | 4–176 | 16–704 GB | HPC, gaming, single-threaded apps
Memory | M1, M2, M3 | 32–416 | 256 GB–12 TB | SAP HANA, in-memory DBs, analytics
Accelerator | A2, A3 (GPU) | 12–96 | 170–1360 GB | ML training, rendering, HPC

⚙️ Key Features

  • Custom Types: Pick exact vCPU + memory ratio
  • Spot/Preemptible: 60–91% cheaper, can be reclaimed
  • Sole-tenant: Dedicated physical server (compliance)
  • Live Migration: Zero-downtime host maintenance
  • Instance Templates: Reusable VM configuration
  • MIG: Managed Instance Group, auto-scaling + healing
  • UIG: Unmanaged Instance Group, heterogeneous VMs
  • Persistent Disk: Network-attached (Standard, Balanced, SSD)
  • Local SSD: Physically attached, 375 GB per disk, ephemeral

🔄 Managed Instance Group (MIG) Flow

Instance Template → Managed Instance Group → Auto-scaling → Health Check → Auto-healing

  • Scaling signals: CPU, LB utilization, Pub/Sub, custom metrics
  • Rolling updates: Canary, max-surge, max-unavailable
  • Multi-zone MIG: Spread across zones for HA
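
The CPU scaling signal can be pictured with a small sizing calculation. This is a simplified model of target-utilization scaling, not the actual autoscaler algorithm; the function name, parameters, and default bounds are illustrative assumptions.

```python
import math


def recommended_size(current_instances: int, avg_cpu_percent: int,
                     target_cpu_percent: int, min_size: int = 1,
                     max_size: int = 10) -> int:
    """Estimate how many instances keep average CPU near the target.

    Simplified model: total load is spread evenly across instances.
    """
    total_load = current_instances * avg_cpu_percent
    wanted = math.ceil(total_load / target_cpu_percent)
    # Clamp to the group's configured bounds (hypothetical defaults).
    return max(min_size, min(max_size, wanted))


if __name__ == "__main__":
    # 4 VMs at 90% CPU with a 60% target: scale out.
    print(recommended_size(4, 90, 60))
```

In this model the group scales out when average utilization exceeds the target and scales in when it falls below, always staying within the min/max bounds.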

💰 Pricing Models

Model | Discount | Details
On-demand | 0% | Pay per second, no commitment
Sustained Use (SUD) | Up to 30% | Auto-applied for consistent monthly usage
Committed Use (CUD 1yr) | Up to 37% | 1-year commitment on vCPU + memory
Committed Use (CUD 3yr) | Up to 55% | 3-year commitment, highest savings
Spot / Preemptible | 60–91% | Can be reclaimed with 30s notice
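
To compare the models side by side, the discount ceilings from the table can be applied to a single hourly rate. A minimal sketch: the $0.10/hour rate and the 730-hour month are placeholder assumptions, and real discounts vary by machine type and usage.

```python
# Maximum discounts from the pricing table above (best case, not guaranteed).
MAX_DISCOUNT = {
    "on_demand": 0.00,
    "sustained_use": 0.30,
    "cud_1yr": 0.37,
    "cud_3yr": 0.55,
    "spot": 0.91,
}


def best_case_monthly_cost(hourly_rate: float, model: str,
                           hours: float = 730.0) -> float:
    """Return the lowest possible monthly cost for a pricing model."""
    discount = MAX_DISCOUNT[model]
    return round(hourly_rate * hours * (1.0 - discount), 2)


if __name__ == "__main__":
    # 0.10 is a made-up on-demand hourly rate used only for comparison.
    for model in MAX_DISCOUNT:
        print(f"{model:14s} ${best_case_monthly_cost(0.10, model):.2f}")
```
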
2. Cloud Functions

⚡ Serverless Functions

Serverless

Event-driven, pay-per-invocation compute. Write a function, pick a trigger, deploy.

Trigger | Source
HTTP | Direct HTTPS endpoint
Pub/Sub | Message published to topic
Cloud Storage | Object create/delete/archive
Firestore | Document write/update/delete
Firebase | Auth, Analytics, Remote Config
Cloud Scheduler | Cron-scheduled invocations
Eventarc | 90+ event sources (Audit Logs, custom)
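
For the Pub/Sub trigger, the message payload arrives base64-encoded in a `data` field. The sketch below shows only the decoding step, assuming a JSON payload; the helper name is hypothetical, and the envelope around the message differs between Gen 1 and Gen 2, so check the shape your runtime delivers.

```python
import base64
import json


def decode_pubsub_message(message: dict) -> dict:
    """Decode the base64 `data` field and return payload plus attributes.

    Assumes the publisher sent JSON; other encodings need their own parsing.
    """
    raw = base64.b64decode(message.get("data", ""))
    payload = json.loads(raw) if raw else {}
    return {"payload": payload, "attributes": message.get("attributes", {})}


if __name__ == "__main__":
    sample = {
        "data": base64.b64encode(json.dumps({"order_id": 456}).encode()).decode(),
        "attributes": {"source": "checkout"},
    }
    print(decode_pubsub_message(sample))
```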

🆚 Gen 1 vs Gen 2

Gen 1

  • Max timeout: 9 min
  • 1 concurrent request per instance
  • Limited event sources
  • Max 8 GB memory
vs

Gen 2 (Recommended)

  • Max timeout: 60 min
  • Up to 1000 concurrent requests
  • Eventarc (90+ sources)
  • Up to 32 GB memory
  • Traffic splitting (revisions)
  • Built on Cloud Run

🛠️ Runtimes & Config

Node.js · Python · Go · Java · .NET · Ruby · PHP

  • Cold Start: ~100ms–2s depending on runtime + deps
  • Max Instances: Configurable per function (default 100)
  • Min Instances: Keep warm to avoid cold starts (Gen 2)
  • VPC Connector: Access private VPC resources
  • Secrets: Mount from Secret Manager
  • Concurrency: Gen 2 only, up to 1000 per instance
3. Cloud Run

🚀 Cloud Run, Serverless Containers

Serverless

Container Image → Cloud Run Service → Auto-scales 0 → N → HTTPS Endpoint

Cloud Run Service: Multiple Revisions (v1, v2, v3…)
  • Revision v1: Traffic 10%
  • Revision v2: Traffic 20%
  • Revision v3 (latest): Traffic 70%
Autoscaler: 0 to 1000 container instances per revision
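
The 10/20/70 split above amounts to weighted routing. The sketch below simulates that idea with cumulative weights; it only illustrates the concept and is not how Cloud Run implements traffic splitting internally. `pick_revision` and `TRAFFIC` are illustrative names.

```python
import bisect
import itertools
import random

# Revision weights from the example above (percent of traffic).
TRAFFIC = {"v1": 10, "v2": 20, "v3": 70}


def pick_revision(split: dict[str, int], roll: float) -> str:
    """Map a roll in [0, 100) to a revision name using cumulative weights."""
    names = list(split)
    cumulative = list(itertools.accumulate(split[name] for name in names))
    return names[bisect.bisect_right(cumulative, roll)]


if __name__ == "__main__":
    counts = {name: 0 for name in TRAFFIC}
    for _ in range(10_000):
        counts[pick_revision(TRAFFIC, random.random() * 100)] += 1
    print(counts)  # roughly proportional to 10/20/70
```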

✨ Features

  • Scale to Zero: No requests = no cost
  • Any Language: If it fits in a container, it runs
  • Custom Domains: Map your own domain w/ managed TLS
  • Concurrency: Up to 1000 requests per container instance
  • Min Instances: Keep warm to eliminate cold starts
  • VPC Connector: Access private VPC resources
  • Cloud SQL: Built-in proxy connector
  • gRPC: Native gRPC + HTTP/2 support
  • Jobs: Run containers to completion (batch)
  • Volume Mounts: GCS buckets, NFS, in-memory

📊 Compute Comparison, When to Use What

Criteria | Cloud Run | Cloud Functions | App Engine | GKE
Unit | Container | Function | App | Pod
Scale to Zero | ✅ | ✅ | ✅ (Standard) | ❌
Custom Runtime | ✅ Any | Limited | Flex only | ✅ Any
Pricing | Per request + CPU | Per invocation | Per instance-hr | Per node
Max Timeout | 60 min | 60 min (Gen 2) | Unlimited | Unlimited
K8s Knowledge | None | None | None | Required
Best For | APIs, microservices | Event handlers | Full web apps | Complex platforms
4. App Engine

🌐 App Engine, Standard vs Flexible

PaaS
Feature | Standard | Flexible
Languages | Python, Java, Go, Node, PHP, Ruby | Any (custom Docker)
Scaling | 0 → auto (rapid) | 1+ instances (slower)
Startup | Seconds | Minutes
Pricing | Per instance-hour (scale to 0) | Per VM (always ≥ 1)
Custom Runtime | ❌ | ✅ Dockerfile
VPC Access | Via connector | Native VPC
SSH Access | ❌ | ✅
WebSockets | ❌ | ✅

🧩 Built-in Services

  • Versions: Deploy multiple versions simultaneously
  • Traffic Splitting: Route % traffic to specific versions
  • Cron Jobs: Scheduled tasks via cron.yaml
  • Task Queues: Background task processing (Cloud Tasks)
  • Memcache: Built-in caching (Standard only)
  • Firewall: IP-based ingress rules
  • Identity-Aware Proxy: Auth without code changes
  • Custom Domains: Map domain with managed SSL

🔄 App Engine Deployment Flow

Code + app.yaml → gcloud app deploy → Version Created → Traffic Splitting → Auto-scaling

⚠️ One App Engine app per project. Cannot change region once set. Consider Cloud Run for new projects.

5. Google Kubernetes Engine (GKE)

☸️ GKE Cluster Architecture

CaaS
GKE Cluster
Control Plane (Google-managed): API Server · etcd · Scheduler · Controller Manager
Node Pools: Pool-1 (e2-standard-4 × 3) · Pool-2 (n2-standard-8 × 5) · GPU Pool (a2 × 2)
Node → Pod → Container (one chain per node)

🆚 Standard vs Autopilot

Feature | Standard | Autopilot
Node Management | You manage | Google manages
Scaling | Cluster + node autoscaler | Auto per pod
Pricing | Pay per node (VM) | Pay per pod (CPU/mem)
Security | You harden | Hardened by default
GPU / TPU | ✅ | ✅ (limited)
Privileged Pods | ✅ | ❌
Best For | Full control, custom configs | Hands-off, cost efficiency

🔑 Key Concepts

  • Node Pools: Groups of nodes with same config; multi-zone for HA
  • Workload Identity: Map K8s ServiceAccount ↔ GCP IAM SA (no keys!)
  • Binary Authorization: Only deploy signed/trusted container images
  • GKE Gateway API: Advanced L7 routing (replaces Ingress)
  • Config Sync: GitOps, sync cluster state from Git repo
  • Policy Controller: OPA Gatekeeper, enforce policies on K8s resources
  • GKE Sandbox: gVisor-based container isolation
  • Release Channels: Rapid / Regular / Stable, auto-upgrade cadence

🌐 GKE Networking

  • VPC-native: Alias IPs for pods, native routing, no overlay
  • ClusterIP: Internal-only service (within cluster)
  • NodePort: Expose on each node's IP:port
  • LoadBalancer: Provision GCP L4 load balancer
  • Ingress: GCP L7 HTTP(S) Load Balancer
  • Network Policies: Pod-to-pod firewall rules (Calico / Dataplane V2)
  • Private Cluster: Nodes have internal IPs only
  • Dataplane V2: eBPF-based networking (Cilium), faster, built-in policies

💾 Storage & Databases

6. Cloud Storage (GCS)

🪣 Storage Classes

Class | Min Duration | Availability | Retrieval Cost | Use Case
Standard | None | 99.99% (multi) / 99.9% (region) | Free | Hot data, frequently accessed
Nearline | 30 days | 99.95% / 99.0% | $0.01/GB | Backups accessed monthly
Coldline | 90 days | 99.95% / 99.0% | $0.02/GB | Disaster recovery, quarterly access
Archive | 365 days | 99.95% / 99.0% | $0.05/GB | Long-term archives, compliance

All classes offer strong global consistency. Identical API, only pricing differs.

🔧 Features

  • Buckets: Globally unique names, regional or multi-regional
  • Objects: Immutable once written (overwrite = new version)
  • Versioning: Keep all versions of an object
  • Lifecycle Rules: Auto-delete or transition classes by age/date
  • Retention Policy: Minimum retention (compliance / WORM)
  • Object Holds: Event-based or temporary hold, prevent deletion
  • Signed URLs: Time-limited access without credentials
  • ACLs + IAM: Fine-grained (ACL) or bucket-level (IAM) access
  • Requester Pays: Data consumer pays for egress + operations

🔄 Lifecycle & Integration

Upload Object → Standard Bucket → 30d → Nearline → 90d → Coldline → 365d → Archive
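
The age thresholds in the flow above can be read as a lookup from object age to storage class. A minimal sketch of that mapping, assuming the 30/90/365-day rules shown; in practice the transitions are configured as bucket lifecycle rules rather than computed in application code.

```python
# (minimum age in days, storage class), checked from coldest to hottest.
TRANSITIONS = [(365, "ARCHIVE"), (90, "COLDLINE"), (30, "NEARLINE")]


def storage_class_for_age(age_days: int) -> str:
    """Return the class an object would be in under the example rules."""
    for min_age, storage_class in TRANSITIONS:
        if age_days >= min_age:
            return storage_class
    return "STANDARD"


if __name__ == "__main__":
    for age in (0, 45, 120, 400):
        print(age, storage_class_for_age(age))
```
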
  • BigQuery: Federated queries directly on GCS files
  • Dataflow: Source/sink for batch + streaming pipelines
  • GKE: CSI driver for volume mounts
  • Cloud Functions: Trigger on object create/delete/archive
  • Transfer Service: Move from AWS S3, Azure, on-prem
7. Cloud SQL

🐬 Managed Relational Database

Managed
Engine | Versions | Max Storage
MySQL | 5.7, 8.0 | 64 TB
PostgreSQL | 12, 13, 14, 15, 16 | 64 TB
SQL Server | 2017, 2019, 2022 | 64 TB

  • vCPUs: Up to 96 vCPUs
  • Memory: Up to 624 GB
  • IOPS: Up to 60,000 (SSD)

🛡️ Features

  • HA: Regional, synchronous replication + auto failover
  • Read Replicas: Same region, cross-region, or external
  • Backups: Automated daily + on-demand
  • PITR: Point-in-time recovery via binary logs
  • Private IP: VPC-peered, no public exposure
  • Maintenance: Configurable maintenance windows
  • Query Insights: Slow query monitoring, query plans
  • IAM Auth: Log in using IAM instead of passwords

🏗️ HA Architecture

Primary Instance: Zone A, Read + Write
⟵ synchronous replication ⟶
Standby Instance: Zone B, Hot standby
Automatic Failover: DNS-based, ~60s failover, same IP

Read replicas use asynchronous replication, good for read scaling, not HA.

8. Cloud Spanner

🌍 Globally Distributed Relational Database

Global
  • What: Relational DB with horizontal scaling + strong consistency
  • Consistency: External consistency (strongest) via TrueTime (atomic clocks + GPS)
  • SLA: 99.999% (multi-region), "five nines"
  • SQL: GoogleSQL (ANSI SQL compliant) + PostgreSQL interface
  • Sharding: Automatic, data split across nodes by primary key
  • Multi-region: nam3, nam6, nam-eur-asia1, global configs
  • Scale: Add/remove nodes on the fly, linear throughput scaling

🏗️ Architecture

Spanner Instance (Global): Contains one or more databases
  • Region A: Node 1, Node 2
  • Region B: Node 3, Node 4
  • Region C (witness): Voting only
Splits (shards): Auto-distributed by primary key range → Colossus storage

TrueTime: atomic clocks + GPS → globally consistent timestamps → external consistency without coordination lag.

📊 Spanner vs Cloud SQL

Feature | Cloud SQL | Cloud Spanner
Scale | Vertical (bigger VM) | Horizontal (add nodes)
Max Size | 64 TB | Unlimited (petabyte+)
Multi-region | Read replicas only | Native multi-region writes
Consistency | Regional strong | Global external
SLA | 99.95% | 99.999%
Cost | $$ (from ~$7/mo) | $$$$ ($0.90/node-hr)
Best For | Traditional apps, small-mid scale | Global apps, financial, gaming
9. Firestore / Datastore

🔥 NoSQL Document Database

NoSQL

Native Mode

  • Real-time listeners
  • Offline support (mobile)
  • Firebase SDK
  • Security Rules
  • Strong consistency
vs

Datastore Mode

  • No real-time/offline
  • Server-side only
  • Higher write throughput
  • IAM-based access
  • Eventually consistent reads available

🗂️ Data Model & Features

Collection users
  Document user_123 (name, email, age)
    Subcollection user_123/orders
      Document order_456 (item, qty, price)

  • Real-time: onSnapshot listeners push changes instantly
  • Offline: Local cache, sync when online (mobile/web)
  • Transactions: ACID transactions (up to 500 docs)
  • Indexes: Auto single-field + manual composite indexes
  • TTL: Auto-delete documents by timestamp field
10. Cloud Bigtable

⚡ Wide-Column NoSQL

Petabyte

  • Scale: Petabyte-scale, billions of rows
  • Latency: Single-digit millisecond (consistent)
  • Compatibility: HBase API compatible
  • Use Cases: Time-series, IoT, analytics, ad-tech, finance
  • Throughput: Millions of reads/writes per second
  • Replication: Multi-cluster, eventual consistency

🏗️ Architecture & Data Model

Client (cbt CLI / HBase / gRPC)
Bigtable Cluster, Nodes (compute): Node count determines throughput
Colossus (storage): SSD or HDD, decoupled from compute

Concept | Description
Row Key | Unique identifier, lexicographically sorted
Column Family | Group of related columns (defined at table creation)
Column Qualifier | Individual column within a family
Cell | Value at row × column, timestamped (versioned)

Design row keys carefully: avoid hotspots, use reverse timestamps for time-series data.
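
One way to apply the reverse-timestamp advice is to lead the key with a well-distributed field and subtract the timestamp from a fixed maximum, so the newest rows sort first within each prefix. A minimal sketch; the `device_id#reversed_ts` layout, the 13-digit width, and the `row_key` helper are illustrative choices, not a Bigtable requirement.

```python
# Largest 13-digit number; millisecond epoch timestamps fit below it.
MAX_TS_MILLIS = 10**13 - 1


def row_key(device_id: str, ts_millis: int) -> str:
    """Build a key that sorts newest-first within each device prefix.

    Leading with device_id spreads writes across devices instead of
    concentrating them on the latest timestamp range (a hotspot).
    """
    reversed_ts = MAX_TS_MILLIS - ts_millis
    # Zero-pad so lexicographic order matches numeric order.
    return f"{device_id}#{reversed_ts:013d}"


if __name__ == "__main__":
    keys = [row_key("sensor-42", ts) for ts in (1000, 2000, 3000)]
    print(sorted(keys))  # the key for ts=3000 sorts first
```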

11. Memorystore

💨 Managed In-Memory Stores

Feature | Redis | Memcached
Persistence | ✅ RDB/AOF | ❌
Data Types | Strings, Lists, Sets, Hashes, Sorted Sets, Streams | Strings only
Clustering | ✅ (Redis Cluster) | ✅ (distributed)
Pub/Sub | ✅ | ❌
Replication | ✅ Read replicas + HA | ❌
Max Size | 300 GB | 5 TB (distributed)
Lua Scripting | ✅ | ❌

🎯 Use Cases & Config

  • Session Caching: Store user sessions for web apps
  • Leaderboards: Sorted sets for real-time rankings
  • Rate Limiting: Token bucket / sliding window
  • Real-time Analytics: Counters, HyperLogLog
  • HA: Auto-failover (Redis Standard/HA tier)
  • Network: VPC-peered, private access only
  • Auth: IAM-based (Redis 7.0+) or AUTH string
  • Monitoring: Cloud Monitoring metrics (hits, misses, memory)
12. AlloyDB

🐘 AlloyDB for PostgreSQL

PostgreSQL
  • Performance: 4× faster than standard PostgreSQL (OLTP)
  • Analytics: 100× faster analytical queries (columnar engine)
  • Compatibility: 100% PostgreSQL compatible (wire protocol)
  • AI Built-in: pgvector for vector search / embeddings
  • HA: 99.99% SLA, auto-failover in < 30s
  • Scale: Up to 128 vCPUs, 864 GB RAM, 64 TB storage

🏗️ Architecture, Disaggregated Storage

Primary Instance (Compute): Handles reads + writes
Ultra-fast Log Processing: WAL processed before storage write, low-latency commits
Disaggregated Storage (Google Colossus): Shared across primary + read pools, automatic replication
Read Pool Instance 1 · Read Pool Instance 2

HTAP: handle both OLTP and OLAP workloads in a single database with the columnar engine.

📊 AlloyDB vs Cloud SQL vs Spanner

Feature | Cloud SQL | AlloyDB | Cloud Spanner
Engine | MySQL, PostgreSQL, SQL Server | PostgreSQL only | GoogleSQL / PG interface
Scale | Vertical | Vertical + read pools | Horizontal (unlimited)
Multi-region | Cross-region read replicas | Cross-region read replicas | Native multi-region writes
OLTP Speed | Baseline | 4× faster | Comparable
Analytics | Limited | 100× (columnar engine) | Good (SQL)
SLA | 99.95% | 99.99% | 99.999%
Cost | $$ | $$$ | $$$$
Best For | Standard workloads | High-performance PG apps | Global-scale apps

📊 Data & Analytics

13. BigQuery

🔍 Serverless Data Warehouse

Serverless
  • Scale: Petabyte-scale, query TBs in seconds
  • Storage: Columnar (Capacitor format) on Colossus
  • Compute: Dremel engine, massively parallel SQL execution via slots
  • Streaming: Real-time inserts via streaming API
  • BQML: Train ML models in SQL (linear reg, XGBoost, DNN, ARIMA+)
  • BI Engine: In-memory acceleration for sub-second dashboards
  • Federated: Query external data (GCS, Cloud SQL, Sheets, Bigtable)
  • BigLake: Unified data lake, query across GCS + BQ with one interface

🏗️ Architecture

Data Sources: GCS, Pub/Sub, Cloud SQL, Sheets, APIs
BigQuery: Dremel Engine (compute) + Colossus (storage)
SQL Query · BQML · BI Engine
Results → Looker Studio / Sheets / Export

💰 Pricing

Component | Model | Price
Queries (On-demand) | Per TB scanned | $5/TB (first 1 TB/mo free)
Queries (Capacity) | Slot-based | ~$0.04/slot-hour (autoscale)
Active Storage | Per GB/month | $0.02/GB
Long-term Storage | >90 days unmodified | $0.01/GB
Streaming Inserts | Per 200 MB | $0.01

Cost optimization: use partitioning + clustering to minimize bytes scanned.
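
A rough calculation shows why pruning matters under on-demand pricing. The sketch uses the $5/TB rate and 1 TB free tier from the table above and assumes evenly sized partitions; both function names are illustrative, and current list prices should be checked before relying on the numbers.

```python
def on_demand_query_cost(tb_scanned: float, price_per_tb: float = 5.0,
                         free_tb: float = 1.0) -> float:
    """Monthly on-demand cost after the free tier, per the table above."""
    billable_tb = max(0.0, tb_scanned - free_tb)
    return billable_tb * price_per_tb


def tb_scanned_with_pruning(table_tb: float, total_partitions: int,
                            partitions_read: int) -> float:
    """Bytes scanned when a filter prunes to a subset of partitions.

    Assumes partitions are evenly sized, which real tables rarely are.
    """
    return table_tb * partitions_read / total_partitions


if __name__ == "__main__":
    full_scan = on_demand_query_cost(10.0)
    pruned = on_demand_query_cost(tb_scanned_with_pruning(10.0, 100, 5))
    print(f"full scan: ${full_scan:.2f}, pruned: ${pruned:.2f}")
```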

🧩 Key Features

Feature | What It Does
Partitioning | Split table by date/int range/ingestion time, prune scans
Clustering | Sort data within partitions by columns, co-locate related rows
Materialized Views | Pre-computed aggregates, auto-refreshed
Scheduled Queries | Cron-based SQL execution
Data Transfers | Import from SaaS (GA, Ads, YouTube, S3)
BigLake | Fine-grained access on data lake files
Analytics Hub | Share datasets across orgs (marketplace)
Change Data Capture | Datastream → real-time CDC into BigQuery
14. Pub/Sub

📨 Global Messaging Service

Publisher: App, Cloud Functions, IoT, gcloud
Topic: Named channel for messages
Subscription (Push): HTTP endpoint delivery
Subscription (Pull): Client polls for messages
Subscriber: Cloud Run, Dataflow, GKE, Functions

⚙️ Features

  • Global: Multi-region by default, publish anywhere
  • Serverless: No infra to manage, auto-scales
  • Delivery: At-least-once (default), exactly-once (configurable)
  • Ordering: Message ordering by ordering key
  • Dead Letter: Route failed messages after N retries
  • Schema: Avro / Protocol Buffer schema validation
  • Retention: 7 days (configurable, up to 31 days)
  • Filtering: Attribute-based subscription filters
  • Throughput: Millions of messages/sec

🎯 Use Cases

  • Event-driven: Decouple services, publish events, react asynchronously
  • Streaming ETL: Pub/Sub → Dataflow → BigQuery (real-time pipeline)
  • Fan-out: 1 topic → N subscriptions (broadcast to multiple consumers)
  • Microservices: Async communication between services
  • IoT Ingestion: Millions of devices publishing sensor data
  • Log Aggregation: Centralize logs from distributed systems
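
Fan-out and dead-lettering can be pictured with a small in-memory model. This is a toy simulation of the semantics only (every subscription receives each message, retries up to a limit, then the message is dead-lettered); it does not use the Pub/Sub client library, and the `Topic` class is hypothetical.

```python
from collections import defaultdict


class Topic:
    """Toy topic: delivers each message to every subscription."""

    def __init__(self, max_delivery_attempts: int = 5):
        self.subscribers = {}                    # subscription name -> callback
        self.dead_letters = defaultdict(list)    # subscription name -> messages
        self.max_delivery_attempts = max_delivery_attempts

    def subscribe(self, name, callback):
        self.subscribers[name] = callback

    def publish(self, message):
        for name, callback in self.subscribers.items():
            for _attempt in range(self.max_delivery_attempts):
                try:
                    callback(message)
                    break  # success counts as an ack
                except Exception:
                    continue  # failure counts as a nack, so retry
            else:
                # Every attempt failed: route to this subscription's dead letters.
                self.dead_letters[name].append(message)


if __name__ == "__main__":
    received = []

    def always_fails(message):
        raise RuntimeError("handler is broken")

    topic = Topic(max_delivery_attempts=3)
    topic.subscribe("billing", received.append)
    topic.subscribe("broken", always_fails)
    topic.publish({"event": "order_created"})
    print(received, dict(topic.dead_letters))
```
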
15. Dataflow

🌊 Managed Apache Beam Runner

Unified

Source → PCollection → Transforms (ParDo, GroupByKey, CoGroupByKey, Flatten) → PCollection → Sink

Sources: Pub/Sub · GCS · BigQuery · Kafka · JDBC
Sinks: BigQuery · GCS · Bigtable · Pub/Sub · Spanner

⚙️ Features

  • Unified: Same code for batch + stream (Apache Beam SDK)
  • Auto-scaling: Workers scale up/down based on backlog
  • Exactly-once: Guaranteed processing semantics
  • Streaming SQL: Write streaming pipelines in SQL
  • Templates: Pre-built (Google-provided) + custom reusable jobs
  • Flex Templates: Containerized, custom deps, private repos
  • Languages: Java, Python, Go (via Apache Beam SDK)

🆚 Batch vs Streaming

Aspect | Batch | Streaming
Input | Bounded (files, tables) | Unbounded (Pub/Sub, Kafka)
Latency | Minutes–hours | Seconds
Windowing | Global window | Fixed / Sliding / Session
Workers | Scale to 0 after job | Always running
Use Case | ETL, backfills, reports | Real-time dashboards, alerts
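
The windowing row is easier to follow with the arithmetic written out. The sketch below assigns an event timestamp to fixed and sliding windows in plain Python, treating each window as the half-open interval [start, end); it illustrates the concept only and is not the Beam SDK API. Function names are illustrative.

```python
def fixed_window(ts: int, size: int) -> tuple[int, int]:
    """Return the single [start, end) fixed window containing ts."""
    start = ts - ts % size
    return (start, start + size)


def sliding_windows(ts: int, size: int, period: int) -> list[tuple[int, int]]:
    """Return every [start, end) sliding window containing ts.

    Windows are `size` long and a new one starts every `period`.
    """
    last_start = ts - ts % period
    # Valid starts satisfy ts - size < start <= ts.
    starts = range(last_start, ts - size, -period)
    return [(start, start + size) for start in starts]


if __name__ == "__main__":
    print(fixed_window(125, 60))          # one window
    print(sliding_windows(125, 60, 30))   # overlapping windows
```
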
16. Dataproc

🔥 Managed Spark / Hadoop

  • Cluster Creation: ~90 seconds, fast spin-up
  • Auto-scaling: Scale workers based on YARN metrics
  • Component Gateway: Jupyter, Spark UI, Zeppelin, HDFS web UI
  • Init Actions: Custom setup scripts at cluster start
  • Preemptible Workers: 60–91% cheaper secondary workers
  • Ecosystem: Spark, Hadoop, Hive, Pig, Presto, Flink
  • Serverless: Dataproc Serverless, submit jobs, no cluster management

🆚 Dataproc vs Dataflow

Criteria | Dataproc (Spark) | Dataflow (Beam)
Engine | Apache Spark | Apache Beam
Management | Clusters (or serverless) | Fully serverless
Best For | Existing Spark jobs, ML (Spark ML), interactive analysis | New pipelines, streaming-first, unified batch/stream
Latency | Micro-batch (~500ms) | True streaming (per-element)
Cost Model | Cluster VMs (per second) | Worker VMs (auto-managed)
Portability | Run on any Spark cluster | Beam runs on Flink, Spark, Dataflow

Rule of thumb: existing Spark → Dataproc. New streaming → Dataflow.

17. Cloud Composer

🎼 Managed Apache Airflow

  • What: DAG-based workflow orchestration, fully managed
  • DAGs: Python-defined directed acyclic graphs
  • Operators: BigQuery, Dataflow, GCS, GKE, Cloud SQL, Dataproc…
  • Sensors: Wait for file, time, external task, HTTP
  • Composer 2: Auto-scaling workers, faster scheduling, lower cost
  • Secrets: Integration with Secret Manager
  • Monitoring: Airflow UI + Cloud Monitoring + Cloud Logging

🔄 Orchestration Flow

DAG Definition (Python) → Airflow Scheduler → Worker Nodes

BigQuery job · Dataflow pipeline · GCS copy · GKE workload · Email notification

Composer environments run on GKE, so you can customize machine types and node counts.

18. Looker & Looker Studio

📊 Enterprise BI vs Free Dashboards

Feature | Looker | Looker Studio (free)
Cost | Enterprise license | Free
Modeling | LookML (semantic layer) | No modeling layer
Data Governance | Centralized metrics, row-level security | Basic sharing
Embedded Analytics | ✅ iframes, SSO | ✅ embeddable reports
Custom Viz | Custom components (React) | Community visualizations
API | Full REST API | Limited
Best For | Enterprise data teams | Quick dashboards, individuals

📈 Looker Studio

  • Cost: Completely free
  • Connectors: BigQuery, Sheets, Cloud SQL, 800+ connectors
  • Sharing: Like Google Docs, view/edit permissions
  • Templates: Pre-built report templates (GA4, Ads, etc.)
  • Interactivity: Date ranges, filters, drill-downs
  • Scheduling: Automated email delivery of reports
  • Calculated Fields: Custom metrics + dimensions (formula editor)

🤖 AI / Machine Learning

19. Vertex AI

🧠 Unified ML Platform

MLOps
Data Sources: BigQuery, GCS, Pub/Sub
Feature Store: Centralized feature engineering + serving
AutoML: No-code training
Custom Training: Your containers / TF / PyTorch
Model Registry: Version, manage, compare models
Online Prediction: Low-latency endpoints
Batch Prediction: High-throughput, async
Model Monitoring: Drift detection, skew, feature attribution

🧩 Components

Component | Description
AutoML | Train models without code (image, text, tabular, video)
Custom Training | Bring your own container / pre-built (TF, PyTorch, XGBoost)
Pipelines | Kubeflow / TFX-based ML workflow orchestration
Feature Store | Centralized feature management + online/offline serving
Model Registry | Version control, metadata, lineage
Endpoints | Deploy models for real-time or batch serving
Experiments | Track + compare training runs
TensorBoard | Managed training visualization
Vector Search | Nearest-neighbor search (embeddings at scale)

🪄 AutoML Capabilities

Domain | Tasks
Vision | Image classification, object detection, segmentation
NLP | Sentiment analysis, entity extraction, classification
Tables | Structured data, regression, classification, forecasting
Video | Classification, object tracking, action recognition

AutoML handles data preprocessing, architecture search, hyperparameter tuning, and deployment automatically.

20. Pre-trained AI APIs

🔌 Ready-to-Use AI, REST API Calls, No ML Expertise Needed

API | Input | Output | Use Case
Vision AI | Image | Labels, OCR, faces, landmarks, objects | Image tagging, document scanning, moderation
Natural Language | Text | Sentiment, entities, syntax, categories | Review analysis, content classification
Speech-to-Text | Audio | Transcription (125+ languages) | Subtitles, voice commands, call center
Text-to-Speech | Text | Audio (380+ voices, 50+ languages) | Accessibility, IVR, audiobooks
Translation | Text | Translated text (130+ languages) | Localization, real-time translation
Video Intelligence | Video | Labels, shots, objects, text, faces | Media analysis, content moderation
Document AI | Document (PDF/image) | Structured data, entities, tables | Invoice processing, form parsing, ID verification
Dialogflow | Text / Audio | Intent, entities, response | Chatbots, IVR, virtual agents

All APIs: REST + client libraries (Python, Java, Go, Node.js). Pay per request, no infrastructure to manage.

21. Gemini

✨ Gemini, Multimodal Foundation Model

GenAI
Model | Capability | Context | Best For
Gemini Ultra | Most capable, complex reasoning | 1M+ tokens | Advanced research, multi-step reasoning
Gemini Pro | Balanced performance/cost | 1M tokens | General tasks, enterprise applications
Gemini Flash | Fastest, most cost-efficient | 1M tokens | High-volume, latency-sensitive tasks

  • Multimodal: Text + Image + Audio + Video + Code in a single prompt
  • Context Window: 1M+ tokens, process entire codebases, long documents
  • Function Calling: Structured tool use, connect to APIs and databases
  • Grounding: Google Search grounding, real-time factual responses
  • Code Generation: Generate, explain, debug code in 20+ languages
  • Reasoning: Chain-of-thought, multi-step problem solving

🔗 Integrations

  • Vertex AI: Gemini API on GCP, enterprise security, VPC, audit logs
  • AI Studio: Web-based prompt IDE, prototype rapidly
  • Gemini API: Direct REST/SDK access for developers
  • Workspace: Gemini in Docs, Sheets, Gmail, Meet, Slides
  • Code Assist: IDE integration, code completion, explanation, generation
  • Cloud Console: Natural language queries for GCP resources

🎯 Use Cases

  • Chatbots: Customer support, internal assistants
  • Code Review: Automated PR reviews, bug detection
  • Document Analysis: Summarize contracts, extract key terms
  • Image Understanding: Describe images, extract text, answer questions
  • Summarization: Meeting notes, article summaries, email triage
  • Translation: Context-aware, nuanced translation
  • Data Analysis: Natural language to SQL, chart interpretation

🔒 Security

22. IAM, Identity & Access Management

🔐 IAM Policy Model

WHO: Member (identity)
+
WHAT: Role (permissions)
+
WHERE: Resource (scope)
Policy Binding: e.g. [email protected] + roles/storage.admin + bucket-xyz

IAM allow policies are additive: if ANY allow policy grants a permission, access is allowed. Allow policies cannot express an explicit deny; for exceptions, use IAM deny policies, which override any allow.
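
The evaluation order can be sketched in a few lines: check deny rules first, then look for any matching allow. This is a deliberately simplified model (it matches permissions directly rather than roles and ignores groups, inheritance, and conditions); `Grant`, `is_allowed`, and the member names are hypothetical.

```python
from typing import NamedTuple


class Grant(NamedTuple):
    member: str       # WHO
    permission: str   # WHAT (real bindings grant roles, not single permissions)
    resource: str     # WHERE


def is_allowed(request: Grant, allows: set, denies: frozenset = frozenset()) -> bool:
    """Deny overrides allow; otherwise any matching allow is enough."""
    if request in denies:
        return False
    return request in allows


if __name__ == "__main__":
    allows = {
        Grant("user:alice", "storage.objects.get", "bucket-xyz"),
        Grant("user:alice", "storage.objects.delete", "bucket-xyz"),
    }
    denies = frozenset({Grant("user:alice", "storage.objects.delete", "bucket-xyz")})
    print(is_allowed(Grant("user:alice", "storage.objects.get", "bucket-xyz"), allows, denies))
    print(is_allowed(Grant("user:alice", "storage.objects.delete", "bucket-xyz"), allows, denies))
```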

👤 Member Types

Type | Format | Description
Google Account | user:[email protected] | Individual person
Service Account | serviceAccount:[email protected]… | Identity for apps/services
Google Group | group:[email protected] | Collection of accounts
Workspace Domain | domain:company.com | All accounts in domain
allAuthenticatedUsers | allAuthenticatedUsers | Any logged-in Google account
allUsers | allUsers | Anyone on the internet (public)

🎭 Role Types & Inheritance

Type | Example | Details
Basic (Avoid) | Owner, Editor, Viewer | 1000s of permissions, too broad
Predefined (Recommended) | roles/storage.objectViewer | Per-service, fine-grained
Custom | roles/myCustomRole | You define exact permissions

Organization: Policies inherited by all below
Folder: Dept / Environment grouping
Project: Isolation + billing boundary
Resource: VM, bucket, dataset, etc.

✅ Best Practices

  • Least Privilege: Grant only what's needed, nothing more
  • Use Groups: Assign roles to groups, not individuals
  • Avoid Basic Roles: Editor/Owner are dangerously broad
  • Use Predefined: Google maintains per-service roles
  • Policy Analyzer: Audit who has access to what
  • IAM Conditions: Conditional access by time, resource type, IP
  • IAM Recommender: Auto-suggests role downgrades for unused permissions
  • Deny Policies: Explicit deny, override any allow (new feature)
23. Service Accounts

🤖 Identity for Services

Type | Created By | Example
Default | Auto (GCE, App Engine) | PROJECT_NUM-compute@…
User-managed | You | [email protected]
Google-managed | Google | Internal agents (cloud services)

Service accounts are both an identity (authenticate as SA) and a resource (grant others access to impersonate it).

🔑 Authentication Methods

  • JSON Keys: Downloadable key file, avoid if possible
  • Workload Identity Federation: No keys! Federate from AWS, Azure, GitHub, OIDC
  • Workload Identity (GKE): Map K8s SA ↔ GCP SA, no keys
  • Impersonation: User/SA acts as another SA, audit trail
  • Short-lived Credentials: Temporary tokens (1hr default) via STS
  • Attached SA: VM/Cloud Run/Functions, auto-injected credentials

✅ Best Practices

  • Don't Use Default SA: Create dedicated SAs per service
  • Don't Download Keys: Use Workload Identity or attached SAs
  • Workload Identity: For GKE, GitHub Actions, AWS, Azure workloads
  • IAM Conditions: Restrict by time, resource, IP
  • Disable Unused: Disable SAs not used for 90+ days
  • Key Rotation: If you must use keys, rotate regularly
  • Audit: Policy Analyzer + Activity Analyzer for SA usage
24. KMS & Secret Manager

🔑 Cloud KMS, Key Management

Key Ring: Logical grouping (per region)
Crypto Key: Encryption key (AES, RSA, EC)
Key Version: v1, v2, v3… (rotate without re-encrypt)

Encryption Level | Who Manages Key | Details
Google Default | Google | Auto, AES-256, no config needed
CMEK | Customer (in KMS) | Your key, Google's HSM
CSEK | Customer (external) | You supply key per-request
EKM | External KMS | Key never leaves your premises

🤫 Secret Manager

  • What: Store API keys, passwords, certificates, tokens
  • Versioning: Immutable versions, enable/disable/destroy
  • IAM: Fine-grained access per secret
  • Auto-rotation: Cloud Functions trigger on rotation schedule
  • Regional/Global: Choose replication policy
  • Cloud Run: Mount as env var or volume
  • Cloud Functions: Direct reference in config
  • GKE: Secret Store CSI Driver integration

🛡️ Encryption Layers

Encryption at Rest: AES-256, all data on disk automatically encrypted
Encryption in Transit: TLS 1.3, all data between services + to clients
Encryption in Use: Confidential VMs, data encrypted in memory (AMD SEV)

Confidential Computing: data stays encrypted even during processing, trusted execution environments (TEEs).

25. Security Command Center

🛡️ Security Command Center, Unified Security Dashboard

Standard Tier (Free)

  • Security Health Analytics
  • Web Security Scanner (basic)
  • Anomaly detection
  • Asset inventory
vs

Premium Tier

  • All Standard features
  • Event Threat Detection
  • Container Threat Detection
  • VM Threat Detection
  • Compliance (CIS, PCI, NIST, ISO)
  • Attack path simulation

🔍 Capabilities

  • Security Health Analytics: Detect misconfigs (public buckets, open firewalls, etc.)
  • Event Threat Detection: Log analysis, brute force, crypto mining, data exfil
  • Container Threat Detection: Detect malicious binaries, reverse shells in GKE
  • Web Security Scanner: XSS, mixed content, outdated libraries
  • Compliance: Map findings to CIS Benchmarks, PCI DSS, NIST 800-53
  • Attack Path: Simulate attack paths to high-value resources

🔄 Findings Pipeline

GCP Resources → SCC Scanners → Findings → Notifications (Pub/Sub) → SIEM / SOAR

  • Export: BigQuery for custom analytics / dashboarding
  • Pub/Sub: Real-time alerting + SIEM integration
  • Chronicle: Google's SIEM, native SCC integration
  • Mute Rules: Suppress known-good findings (reduce noise)
26. Organization Policy & VPC Service Controls

📋 Organization Policies

Constraints applied at org, folder, or project level, guardrails for the entire cloud.

Policy | Effect
Restrict VM external IPs | VMs can't have public IPs
Restrict resource locations | Only allow us-central1, europe-west1
Disable serial port access | Block VM serial console login
Disable SA key creation | No downloadable service account keys
Restrict shared VPC projects | Control who can attach to shared VPC
Uniform bucket-level access | Force IAM-only (no ACLs) on GCS

🏰 VPC Service Controls

API Request: User or service calling GCP API
Access Level Check: IP range, device policy, identity, geo
Service Perimeter: Boundary around GCP projects + services
GCP API: BigQuery, GCS, Pub/Sub, etc., data stays inside perimeter

  • Prevents: Data exfiltration, even if IAM is misconfigured
  • Perimeter Bridges: Allow controlled sharing between perimeters
  • Dry Run: Test policies before enforcement
  • Ingress/Egress: Fine-grained rules for cross-perimeter access

🔧 DevOps

27. Cloud Build

🏗️ CI/CD Pipeline

Source (GitHub / CSR / Bitbucket) → Trigger (push / PR / tag) → Build Steps → Artifact Registry → Deploy (Cloud Run / GKE)

cloudbuild.yaml: steps: [{name: 'gcr.io/cloud-builders/docker', args: ['build', '-t', '...']}]
  • Step 1: Build (docker build)
  • Step 2: Test (go test / npm test)
  • Step 3: Push (docker push)
  • Step 4: Deploy (gcloud run deploy)
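
The four steps can be written out as a build config. Cloud Build accepts JSON as well as YAML, so this sketch builds the config as a Python dict and prints it as JSON. The image path, service name, and region are placeholder assumptions; the test step assumes an npm project whose dependencies are already available, and `$COMMIT_SHA` is only populated for triggered builds.

```python
import json

# Hypothetical names: replace with your own repository, service, and region.
IMAGE = "us-docker.pkg.dev/$PROJECT_ID/repo/img:$COMMIT_SHA"
SERVICE = "my-service"
REGION = "us-central1"


def build_config() -> dict:
    """Return a build config with build, test, push, and deploy steps."""
    return {
        "steps": [
            # Step 1: build the container image.
            {"name": "gcr.io/cloud-builders/docker",
             "args": ["build", "-t", IMAGE, "."]},
            # Step 2: run tests (assumes an npm project).
            {"name": "gcr.io/cloud-builders/npm", "args": ["test"]},
            # Step 3: push the image to Artifact Registry.
            {"name": "gcr.io/cloud-builders/docker", "args": ["push", IMAGE]},
            # Step 4: deploy the pushed image to Cloud Run.
            {"name": "gcr.io/cloud-builders/gcloud",
             "args": ["run", "deploy", SERVICE,
                      "--image", IMAGE, "--region", REGION]},
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_config(), indent=2))
```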

⚙️ Features

  • Triggers: Push, PR, tag, manual, Pub/Sub, webhook
  • Config: YAML (cloudbuild.yaml) or Dockerfile
  • Parallel Steps: Run steps concurrently with waitFor
  • Worker Pools: Private, run in your VPC
  • Builders: docker, gcloud, kubectl, terraform, maven, gradle, npm
  • Substitutions: $BRANCH_NAME, $COMMIT_SHA, custom vars
  • Approval: Manual approval gates for production deploys

💰 Pricing & Integration

  • Free Tier: 120 build-minutes/day (e2-standard-1)
  • Machine Types: e2-standard-1, e2-highcpu-8, e2-highcpu-32
  • Artifact Registry: Push images/packages directly
  • Cloud Deploy: Continuous delivery to GKE/Cloud Run
  • Binary Authorization: Attest builds for trusted deployment
  • SLSA: Supply chain security levels (provenance)
28. Artifact Registry

📦 Multi-Format Repository

Format | Ecosystem | Example
Docker | Containers | us-docker.pkg.dev/proj/repo/img:tag
Maven / Gradle | Java | Java libraries and apps
npm | Node.js | JavaScript packages
pip (PyPI) | Python | Python packages
Go | Go modules | Go dependencies
Apt / Yum | OS packages | Debian / RPM packages

🔧 Features

  • Vulnerability Scanning: Automatic CVE detection for container images
  • IAM Access: Fine-grained per-repo permissions
  • Regional / Multi-regional: Store close to your compute
  • Cleanup Policies: Auto-delete old tags/versions by age
  • Immutable Tags: Prevent tag overwriting (production safety)
  • Virtual Repos: Proxy upstream + private repos in one endpoint
  • Remote Repos: Cache Docker Hub, Maven Central, npm registry
  • SBOM: Software bill of materials generation
29. Cloud Deploy

🚢 Managed Continuous Delivery

Artifact (image) → Release → Dev → Staging → Prod

Delivery Pipeline (YAML): Defines ordered targets + strategy
  • Dev Target: Auto-promote
  • Staging Target: Approval gate
  • Prod Target: Canary → Full rollout

⚙️ Features

  • Pipeline as Code: YAML-defined delivery pipelines + targets
  • Rollback: One-click rollback to previous release
  • Canary Deploys: Progressive, 10% → 50% → 100%
  • Targets: GKE clusters, Cloud Run services
  • Approval: Manual approval workflows per target
  • Deploy Hooks: Pre/post deploy actions (verify, test)
  • Automation Rules: Auto-promote on success, auto-rollback on failure
  • Parallel Deploys: Deploy to multiple targets simultaneously
30. Infrastructure as Code

📝 Terraform vs Deployment Manager

Feature | Terraform (Recommended) | Deployment Manager
Language | HCL (HashiCorp) | YAML + Jinja/Python
Multi-cloud | ✅ (AWS, Azure, GCP, +1000 providers) | GCP only
State | Remote (GCS bucket) or Terraform Cloud | Google-managed
Modules | Rich registry + custom modules | Templates
Community | Massive ecosystem | Limited
Plan/Preview | terraform plan (diff before apply) | Preview API
Status | Actively developed | Maintenance mode

🔧 Terraform on GCP

Provider: google + google-beta (hashicorp/google, hashicorp/google-beta)
Auth: gcloud auth application-default login
State Backend: GCS bucket (built-in state locking; enable object versioning for recovery)
Modules: terraform-google-modules (Google official)
Workspaces: Isolate dev/staging/prod state
Import: Import existing GCP resources into state
Terraform Cloud: Remote execution, policy-as-code (Sentinel)
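
To make the provider and state-backend settings concrete, here is a minimal sketch that emits them in Terraform's JSON syntax (a `main.tf.json` file) so the example stays in Python. The project, region, and bucket names are hypothetical, and the per-environment state prefix is one possible layout, an alternative to workspaces rather than a recommendation from this guide.

```python
import json


def terraform_config(project, region, state_bucket, env):
    """Provider + GCS backend settings as a Terraform JSON document."""
    provider_settings = {"project": project, "region": region}
    return {
        "terraform": {
            "required_providers": {
                "google": {"source": "hashicorp/google"},
                "google-beta": {"source": "hashicorp/google-beta"},
            },
            # Remote state in a GCS bucket, one prefix per environment.
            "backend": {
                "gcs": {"bucket": state_bucket, "prefix": f"terraform/state/{env}"}
            },
        },
        "provider": {
            "google": dict(provider_settings),
            "google-beta": dict(provider_settings),
        },
    }


def render(config):
    """Serialize for writing to main.tf.json."""
    return json.dumps(config, indent=2)


print(render(terraform_config("my-project", "us-central1", "my-tf-state", "dev")))
```

Writing the output to `main.tf.json` and running `terraform init` in that directory would initialize the backend, assuming the bucket exists and application-default credentials are set up.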

🔀 Other IaC Tools

Config Connector: K8s CRDs for GCP resources, GitOps-native IaC
Pulumi: TypeScript, Python, Go, C# (real programming languages)
Crossplane: K8s-native universal control plane for any cloud
CDK for Terraform: Use TypeScript/Python to generate Terraform

Config Connector: ideal if you already run GKE and prefer GitOps workflows.


📡 Operations (Observability)

31. Cloud Monitoring

📊 Observability Flow

GCP Resources → Metrics → Cloud Monitoring → Dashboards + Alerts → Notifications

Notification channels: Email, Slack, PagerDuty, Pub/Sub, Webhooks, SMS

⚙️ Features

Built-in Metrics: 1500+ metrics for all GCP services
Custom Metrics: Write your own via API or OpenTelemetry
Uptime Checks: HTTP, TCP, HTTPS from global locations
SLO Monitoring: Define SLIs + SLOs, track error budgets
Alerting Policies: Conditions + notification channels + documentation
Dashboards: Custom dashboards with charts, gauges, tables
Metrics Explorer: Ad-hoc queries and visualization
PromQL: Query metrics using Prometheus syntax
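
The error budget behind SLO monitoring is plain arithmetic: the budget is the share of the window (or of requests) that the SLO allows to fail. A small worked sketch in Python; the function names are hypothetical and the numbers are illustrative.

```python
def error_budget_minutes(slo: float, window_days: int = 30) -> float:
    """Allowed downtime in minutes for an availability SLO over a rolling window."""
    if not 0 < slo < 1:
        raise ValueError("slo must be a fraction between 0 and 1, e.g. 0.999")
    return (1 - slo) * window_days * 24 * 60


def budget_remaining(slo: float, good: int, total: int) -> float:
    """Fraction of a request-based error budget still unspent (negative if overspent)."""
    if total <= 0:
        return 1.0
    allowed_bad = (1 - slo) * total
    actual_bad = total - good
    return 1 - actual_bad / allowed_bad


# 99.9% over 30 days allows roughly 43.2 minutes of downtime.
print(round(error_budget_minutes(0.999), 1))
# 500 failures out of 1,000,000 requests uses about half of a 99.9% budget.
print(round(budget_remaining(0.999, good=999_500, total=1_000_000), 2))
```

Alerting on how fast `budget_remaining` falls, rather than on individual errors, is the usual reason to track a budget at all.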

🔗 Advanced

MQL: Monitoring Query Language, powerful filtering + aggregation
Multi-project: Metrics scopes, single pane across projects
Prometheus: Managed Prometheus (GMP), scrape + PromQL
Grafana: Use Grafana with Cloud Monitoring datasource
Service Monitoring: Auto-detect App Engine, GKE, Cloud Run services
Ops Agent: Unified agent for metrics + logs on GCE VMs
32. Cloud Logging

Logging Pipeline

Sources: GCE, GKE, Cloud Run, Cloud Functions, App Engine, Custom Apps
Cloud Logging: Ingest, index, analyze
Log Router (Sinks): Include/exclude filters → route to destinations
Cloud Storage: Long-term archive
BigQuery: SQL analytics
Pub/Sub: Streaming / SIEM
Splunk / Chronicle: External SIEM
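
The routing step can be pictured as every sink evaluating each log entry independently: an entry goes to a sink's destination when it matches the inclusion filter and no exclusion filter. A local Python model of that logic; the `Sink` class, the predicate style, and the destination strings are illustrative assumptions, not the Logging API or its filter language.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

LogEntry = Dict[str, str]
Predicate = Callable[[LogEntry], bool]


@dataclass
class Sink:
    name: str
    destination: str
    include: Predicate
    exclusions: List[Predicate] = field(default_factory=list)

    def accepts(self, entry: LogEntry) -> bool:
        # Must match the inclusion filter and none of the exclusion filters.
        return self.include(entry) and not any(ex(entry) for ex in self.exclusions)


def route(entry: LogEntry, sinks: List[Sink]) -> List[str]:
    """Destinations that receive this entry; sinks do not affect each other."""
    return [sink.destination for sink in sinks if sink.accepts(entry)]
```

Because sinks are independent, one entry can land in several destinations, for example an archive bucket and a BigQuery dataset at the same time.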

⚙️ Features

Logs Explorer: Search, filter, analyze logs in real-time
Log-based Metrics: Create custom metrics from log patterns
Log-based Alerts: Alert when specific log entries appear
Retention: 30 days default (_Default bucket), configurable up to 3650 days
Exclusion Filters: Drop noisy logs before ingestion (save cost)
Log Router: Route logs to different sinks with filters
Log Buckets: Custom storage with per-bucket retention
Log Analytics: SQL-like queries on log data (BigQuery-powered)

📋 Log Types

Type | Source | Details
Platform Logs | GCP services | Auto-generated (GCE, GKE, Cloud SQL…)
User Logs | Your applications | Stdout/stderr, logging client libraries
Audit Logs | Admin + Data Access | Who did what, when, where
Access Transparency | Google staff | When Google accesses your data

Admin Activity audit logs: always on, free, 400-day retention. Data Access audit logs: off by default (except BigQuery), billable once enabled.
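
For the user logs listed in the table, managed runtimes such as Cloud Run, GKE, and Cloud Functions parse JSON lines written to stdout as structured entries, with `severity` and `message` treated as special fields. A minimal Python sketch using only the standard library; the `log` helper and the extra `orderId` field are hypothetical.

```python
import json
import sys


def log(severity: str, message: str, **fields) -> str:
    """Write one JSON log line to stdout and return it."""
    entry = {"severity": severity, "message": message, **fields}
    line = json.dumps(entry)
    print(line, file=sys.stdout, flush=True)
    return line


log("INFO", "order accepted", orderId="42")
log("ERROR", "payment failed", orderId="42")
```

Extra keys such as `orderId` end up in the entry's JSON payload, which keeps them filterable without parsing free-form text.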

33. Trace, Profiler & Error Reporting

Cloud Trace, Distributed Tracing

What: Distributed tracing, track requests across services
Protocol: OpenTelemetry (recommended), Zipkin, Cloud Trace API
Auto-instrumented: App Engine, Cloud Run, Cloud Functions
Analysis: Latency distribution, bottleneck identification
Trace Explorer: Search traces by latency, service, status
Integration: Link traces ↔ logs ↔ metrics for full context
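
Linking traces to logs comes down to putting the trace ID from the incoming request into each log entry. This sketch parses a W3C `traceparent` header and builds the special structured-logging fields that carry the trace and span. The project ID is hypothetical, and the validation is deliberately minimal (it does not reject all-zero IDs).

```python
import re
from typing import Dict

# traceparent: version-traceid-spanid-flags, all lowercase hex
_TRACEPARENT = re.compile(r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def trace_log_fields(traceparent: str, project_id: str) -> Dict[str, object]:
    """Map a traceparent header to log fields; empty dict if it is malformed."""
    match = _TRACEPARENT.match(traceparent.strip().lower())
    if not match:
        return {}
    _version, trace_id, span_id, flags = match.groups()
    return {
        "logging.googleapis.com/trace": f"projects/{project_id}/traces/{trace_id}",
        "logging.googleapis.com/spanId": span_id,
        # Lowest flag bit marks the request as sampled.
        "logging.googleapis.com/trace_sampled": bool(int(flags, 16) & 0x01),
    }


header = "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"
print(trace_log_fields(header, "my-project"))
```

Merging the returned dict into a JSON log line lets the log viewer group entries under the matching trace.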

🔥 Cloud Profiler

What: Continuous CPU + memory profiling in production
Overhead: < 0.5%, safe for production
Visualization: Interactive flame graphs
Languages: Java, Go, Python, Node.js
Compare: Side-by-side profiles across versions/time
Cost: Free

🚨 Error Reporting

What: Aggregate + display errors across GCP services
Grouping: Auto-group similar errors by stack trace
Stack Traces: Full stack traces with source context
Notifications: Email / mobile alerts on new errors
Integration: Cloud Logging, errors auto-detected from logs
Languages: Java, Python, Go, Node.js, .NET, Ruby, PHP
Resolution: Mark errors as acknowledged / resolved / muted

πŸ›οΈ Management

34. Resource Hierarchy

πŸ—οΈ GCP Resource Hierarchy

Organization: company.com, top of hierarchy (Workspace/Cloud Identity domain)
Folders (department grouping): Engineering, Finance
Folders (environment): Dev, Staging, Prod
Projects: web-app-dev, api-staging, api-prod
Resources: GCE VMs, GCS Buckets, BigQuery Datasets, Cloud SQL Instances

IAM policies + org policies inherit downward. A policy set at the org level applies to every resource below it.
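
Downward inheritance can be modelled as a union: the roles in effect on a resource are those granted on the resource itself plus those granted on every ancestor. A simplified Python sketch; the resource names and members are hypothetical, and real IAM also has deny policies and conditions, which this ignores.

```python
from typing import Dict, Optional, Set

Policy = Dict[str, Set[str]]  # role -> members


def effective_bindings(
    resource: str,
    parents: Dict[str, Optional[str]],
    policies: Dict[str, Policy],
) -> Policy:
    """Merge role bindings from the resource up through all of its ancestors."""
    merged: Policy = {}
    node: Optional[str] = resource
    while node is not None:
        for role, members in policies.get(node, {}).items():
            merged.setdefault(role, set()).update(members)
        node = parents.get(node)  # None once the organization is passed
    return merged


parents = {
    "projects/api-prod": "folders/prod",
    "folders/prod": "folders/engineering",
    "folders/engineering": "organizations/company.com",
    "organizations/company.com": None,
}
policies = {
    "organizations/company.com": {"roles/viewer": {"group:all-staff@company.com"}},
    "projects/api-prod": {"roles/editor": {"user:dev@company.com"}},
}
print(effective_bindings("projects/api-prod", parents, policies))
```

In this model a grant made at the organization shows up on every project, which is why broad roles are best granted as low in the tree as practical.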

📚 Key Concepts

Organization: Root node, linked to Workspace/Cloud Identity domain
Folders: Optional grouping, up to 10 nesting levels
Projects: Isolation boundary: own IAM, billing, APIs, quotas
Project ID: Globally unique, immutable once created
Project Number: Auto-assigned, used internally
Labels: Key-value metadata on resources (for billing, filtering)
Resource Manager: API to manage org/folders/projects programmatically

✅ Best Practices

One Org: Single org for all company resources
Folder by Dept + Env: Engineering/Finance → Dev/Staging/Prod
Project per Service: Separate projects for each app/microservice
Labels: team, env, cost-center, app (for cost tracking)
Shared VPC: Centralize networking in a host project
Policy Inheritance: Set org-wide policies at the top; override org policies lower down only where needed (IAM allow grants are additive and cannot be removed at a lower level)
35. Billing & Cost Management

💳 Billing Structure

Billing Account: Payment method + invoicing
Linked projects: Project A, Project B, Project C

Budgets: Set thresholds + email/Pub/Sub alerts
Cost Breakdown: By project, service, SKU, label
Billing Export: Export to BigQuery for custom analysis
CUDs: Committed use discounts (1yr/3yr)
SUDs: Sustained use discounts (auto-applied)
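
Budget alerts sent to Pub/Sub arrive as a base64-encoded JSON message. A sketch of decoding one and deciding whether to escalate. The field names (`costAmount`, `budgetAmount`, `budgetDisplayName`, `alertThresholdExceeded`, `currencyCode`) are written from memory of the documented notification format and should be verified. The `page_at` rule and the function names are hypothetical.

```python
import base64
import json
from typing import Dict


def parse_budget_message(data_b64: str) -> Dict[str, object]:
    """Decode a budget notification and compute the spent ratio."""
    payload = json.loads(base64.b64decode(data_b64).decode("utf-8"))
    cost = float(payload["costAmount"])
    budget = float(payload["budgetAmount"])
    return {
        "budget": payload.get("budgetDisplayName", ""),
        "spent_ratio": cost / budget if budget else 0.0,
        # Assumed to be present only once a configured threshold is crossed.
        "threshold": payload.get("alertThresholdExceeded"),
        "currency": payload.get("currencyCode", ""),
    }


def should_page(summary: Dict[str, object], page_at: float = 1.0) -> bool:
    """Escalate once spend reaches page_at times the budget."""
    return float(summary["spent_ratio"]) >= page_at
```

Budgets only notify; they do not cap spend, so any shutdown or throttling logic has to live in a handler like this one.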

💡 Cost Optimization

Right-size VMs: Use Recommender to downsize underutilized VMs
Spot / Preemptible: Up to 91% savings for fault-tolerant workloads
CUDs: Commit for stable workloads (37–55% off)
Autoscaling: Scale down during low traffic
Storage Lifecycle: Auto-transition to colder storage classes
BQ Editions: Capacity (slot-based) pricing for predictable BQ costs; replaces the retired flat-rate plans
Network Egress: Keep traffic intra-region when possible
Idle Resources: Delete unused IPs, disks, LBs, snapshots
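
To see what the discount levels mean in money, here is a worked comparison in Python. The discounts are the upper bounds quoted in this guide ("up to" figures), and the $0.10/hour on-demand rate is a made-up number for illustration, not a real price.

```python
HOURS_PER_MONTH = 730  # average hours in a month

# Upper-bound discounts quoted in this guide; actual rates vary by
# machine type and region.
DISCOUNTS = {
    "on_demand": 0.00,
    "cud_1yr": 0.37,
    "cud_3yr": 0.55,
    "spot": 0.91,
}


def monthly_cost(hourly_rate: float, model: str, hours: float = HOURS_PER_MONTH) -> float:
    """Cost of one VM for a month under a pricing model."""
    return round(hourly_rate * hours * (1 - DISCOUNTS[model]), 2)


for model in DISCOUNTS:
    print(f"{model:10s} ${monthly_cost(0.10, model):.2f}")
```

The comparison assumes the VM runs all month. A commitment is billed whether or not the VM runs, so CUDs only win for steady workloads, while Spot savings assume the job tolerates being reclaimed.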

🛠️ Tools

Cost Management: Dashboard with spend trends + forecasts
Recommender API: Right-sizing, idle resource cleanup suggestions
Active Assist: Umbrella for all recommendation engines
FinOps Hub: Centralized cost visibility + governance
Pricing Calculator: Estimate costs before deploying
Committed Use Analysis: Analyze CUD coverage + utilization
36. GCP Quick Reference, Master Table

📖 Every GCP Service at a Glance

Category | Service | Type | Use Case
Compute | Compute Engine | IaaS (VMs) | Custom VMs, lift-and-shift, HPC
Compute | Cloud Functions | FaaS (Serverless) | Event-driven functions, webhooks
Compute | Cloud Run | CaaS (Serverless) | Containerized APIs, microservices
Compute | App Engine | PaaS | Full web apps, rapid deployment
Compute | GKE | CaaS (Managed K8s) | Complex platforms, multi-service orchestration
Storage & DB | Cloud Storage (GCS) | Object Storage | Files, backups, data lake, static hosting
Storage & DB | Cloud SQL | Managed RDBMS | MySQL, PostgreSQL, SQL Server workloads
Storage & DB | Cloud Spanner | Global RDBMS | Global apps, finance, 99.999% SLA
Storage & DB | Firestore | NoSQL Document | Mobile/web apps, real-time sync
Storage & DB | Cloud Bigtable | NoSQL Wide-column | IoT, time-series, analytics (petabyte-scale)
Storage & DB | Memorystore | In-memory | Caching, sessions, leaderboards
Storage & DB | AlloyDB | Managed PostgreSQL | High-perf OLTP + OLAP, AI workloads
Data & Analytics | BigQuery | Data Warehouse | SQL analytics, ML, petabyte-scale queries
Data & Analytics | Pub/Sub | Messaging | Event streaming, decoupling, fan-out
Data & Analytics | Dataflow | Stream/Batch (Beam) | ETL pipelines, real-time processing
Data & Analytics | Dataproc | Managed Spark/Hadoop | Existing Spark jobs, big data processing
Data & Analytics | Cloud Composer | Workflow (Airflow) | DAG orchestration, batch scheduling
Data & Analytics | Looker / Looker Studio | BI / Dashboards | Enterprise BI, self-service dashboards
AI / ML | Vertex AI | ML Platform | Train, deploy, manage ML models (AutoML + custom)
AI / ML | Pre-trained AI APIs | AI APIs | Vision, NLP, Speech, Translation, no ML skills
AI / ML | Gemini | Foundation Model | Multimodal GenAI, chat, code, analysis
Security | IAM | Access Control | Who can do what on which resource
Security | Service Accounts | Machine Identity | Identity for apps, VMs, CI/CD
Security | KMS + Secret Manager | Key / Secret Mgmt | Encryption keys, API secrets, certs
Security | Security Command Center | Security Posture | Vulnerabilities, threats, compliance
Security | Org Policy + VPC-SC | Governance | Guardrails, data exfiltration prevention
DevOps | Cloud Build | CI/CD | Build, test, deploy automation
DevOps | Artifact Registry | Package Repository | Docker images, npm, pip, Maven packages
DevOps | Cloud Deploy | Continuous Delivery | Progressive rollout to GKE / Cloud Run
DevOps | Terraform / Config Connector | IaC | Provision infrastructure as code
Operations | Cloud Monitoring | Metrics + Alerts | Dashboards, SLOs, uptime checks, alerts
Operations | Cloud Logging | Log Management | Ingest, search, route, analyze logs
Operations | Trace / Profiler / Error Reporting | APM | Distributed tracing, profiling, error tracking
Networking | VPC | Virtual Network | Isolated network, subnets, firewall rules
Networking | Cloud Load Balancing | Load Balancer | Global/regional L4/L7 traffic distribution
Networking | Cloud CDN | CDN | Cache content at Google edge locations
Networking | Cloud DNS | DNS | Managed authoritative DNS
Networking | Cloud Interconnect / VPN | Hybrid Connectivity | Connect on-prem to GCP (dedicated or VPN)
Management | Resource Hierarchy | Organization | Org → Folders → Projects → Resources
Management | Billing | Cost Management | Budgets, alerts, cost optimization
Management | Active Assist / Recommender | Optimization | Right-sizing, idle cleanup, security fixes