The simplest way to deploy and scale your application in GCP. Provides end-to-end application management.
Supports:
- Go, Java, .NET, Node.js, Python, Ruby using preconfigured runtimes
- Use a custom runtime and write code in any language
- Connect to a variety of Google Cloud storage products (Cloud SQL, etc.)
No usage charges – Pay for resources provisioned
Features:
- Automatic load balancing & Auto scaling
- Managed platform updates & Application health monitoring
- Application versioning
- Traffic splitting
App Engine Environments
- Standard Environment – Preconfigured runtimes with automatic scaling
- Flexible Environment – Customizable VMs with more control.
Standard Environment
Best for: Small to medium apps with rapid scaling needs
✅ Key Features:
- Fully managed – No server management required
- Automatic scaling – Scales to zero when idle (cost-efficient)
- Fast deployment – Quick startup and deployment times
- Limited customization – Uses predefined runtime environments
✅ Supported Languages:
- Node.js, Python, Java, Go, PHP, Ruby
✅ Billing Model:
- Pay only for what you use (compute, bandwidth, storage)
Flexible Environment
Best for: Large applications requiring custom configurations
✅ Key Features:
- Customizable runtimes – Uses Docker containers for more flexibility
- More resource control – Choose CPU, memory, and disk size
- Supports background processes – Unlike Standard, it allows long-running processes
- Supports any language – Run any framework or language inside a container
✅ Billing Model:
- Always running – Even at zero load, it incurs some cost
✅ Use Cases:
- Apps requiring native libraries or custom dependencies
- Apps needing long-running background processes
- Apps needing custom Docker images
Application Component Hierarchy
Application: One App per project
Service(s): Multiple microservices or app components
- You can have multiple services in a single application
- Each service can have different settings
- Services were previously called Modules
Version(s): Each version associated with code and configuration
- Each Version can run in one or more instances
- Multiple versions can co-exist
- Options to rollback and split traffic
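The hierarchy above can be pictured as nested containers. The following is only an illustrative data model, not an App Engine API: the class names, the `traffic_share` field, and the project ID `my-project` are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Version:
    """One deployed version: a snapshot of code plus configuration."""
    version_id: str
    instances: int = 1          # each version runs on one or more instances
    traffic_share: float = 0.0  # hypothetical field: fraction of service traffic


@dataclass
class Service:
    """A microservice or app component that owns its versions."""
    name: str
    versions: Dict[str, Version] = field(default_factory=dict)

    def add_version(self, version: Version) -> None:
        self.versions[version.version_id] = version


@dataclass
class Application:
    """One application per project; it contains one or more services."""
    project_id: str
    services: Dict[str, Service] = field(default_factory=dict)

    def add_service(self, service: Service) -> None:
        self.services[service.name] = service


app = Application(project_id="my-project")  # hypothetical project ID
default = Service(name="default")
default.add_version(Version("v1", instances=2, traffic_share=0.5))
default.add_version(Version("v2", instances=2, traffic_share=0.5))  # versions co-exist
app.add_service(default)
```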
Key Differences: Standard vs. Flexible
| Feature | Standard Environment | Flexible Environment |
|---|---|---|
| Startup Time | Fast | Slower (uses VM) |
| Scaling | Auto-scales to zero | Auto/manual scaling (but always running) |
| Customization | Limited (predefined runtimes) | Full control (custom Docker images) |
| Networking | Uses Google’s internal load balancing | Uses Compute Engine networking |
| Cost Efficiency | Pay-per-use, scales to zero | Costs more (VMs run even when idle) |
| Statefulness | Stateless | Stateful possible |
| Supported Languages | Node.js, Python, Java, Go, PHP, Ruby | Any language via Docker |
Scaling Instances
Automatic – Automatically scale instances based on the load:
- Recommended for continuously running workloads
- Auto scale based on:
  - Target CPU Utilization – Configure a CPU usage threshold
  - Target Throughput Utilization – Configure a throughput threshold
  - Max Concurrent Requests – Configure max concurrent requests an instance can receive
- Configure Max Instances and Min Instances
Basic – Instances are created as and when requests are received:
- Recommended for ad hoc workloads
- Instances are shut down when there are no requests
  - Tries to keep costs low
  - High latency is possible
- Not supported by App Engine Flexible Environment
- Configure Max Instances and Idle Timeout
Manual – Configure specific number of instances to run:
- Adjust number of instances manually over time.
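To get a feel for how a CPU utilization target interacts with min and max instances, here is a rough sketch. It assumes a simple proportional model, which is an assumption made for illustration and not App Engine's actual scheduling algorithm; the function name is hypothetical, and the defaults are taken from the automatic_scaling values used in this document's app.yaml.

```python
import math


def desired_instances(current_instances: int,
                      observed_cpu: float,
                      target_cpu: float = 0.65,
                      min_instances: int = 5,
                      max_instances: int = 100) -> int:
    """Estimate the instance count that brings average CPU back to the target.

    Assumed model: total load is current_instances * observed_cpu, so dividing
    by the target utilization gives the instances needed. The result is then
    clamped to the configured min/max bounds.
    """
    if not 0 < target_cpu <= 1:
        raise ValueError("target_cpu must be in (0, 1]")
    needed = math.ceil(current_instances * observed_cpu / target_cpu)
    return max(min_instances, min(max_instances, needed))


print(desired_instances(10, 0.90))   # load above target: scale out
print(desired_instances(10, 0.10))   # load far below target: floor at min_instances
print(desired_instances(100, 0.99))  # load above capacity: capped at max_instances
```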
app.yaml Reference
runtime: python27  # The name of the runtime environment that is used by your app
api_version: 1  # Used by the legacy python27 runtime
# RECOMMENDED: set the version ID at deploy time - gcloud app deploy -v [YOUR_VERSION_ID]
instance_class: F1
service: service-name
#env: flex
inbound_services:
- warmup
env_variables:
  ENV_VARIABLE: "value"
handlers:
- url: /
  script: home.app
automatic_scaling:
  target_cpu_utilization: 0.65
  min_instances: 5
  max_instances: 100
  max_concurrent_requests: 50
#basic_scaling:
#  max_instances: 11
#  idle_timeout: 10m
#manual_scaling:
#  instances: 5
Request Routing
You can use a combination of three approaches:
- Routing with URLs:
  - https://PROJECT_ID.REGION_ID.r.appspot.com (default service called)
  - https://SERVICE-dot-PROJECT_ID.REGION_ID.r.appspot.com (specific service)
  - https://VERSION-dot-SERVICE-dot-PROJECT_ID.REGION_ID.r.appspot.com (specific version of service)
  - Replace -dot- with . if using a custom domain
- Routing with a dispatch file:
  - Configure dispatch.yaml with routes
  - gcloud app deploy dispatch.yaml
- Routing with Cloud Load Balancing:
  - Configure routes on the load balancer
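The URL patterns above are mechanical enough to generate. A small sketch, assuming a hypothetical helper `appspot_url` and example values (`my-project`, `uc`, `api`) that are not from these notes; it covers only the three patterns listed here:

```python
from typing import Optional


def appspot_url(project_id: str, region_id: str,
                service: Optional[str] = None,
                version: Optional[str] = None) -> str:
    """Build the appspot.com URL for the default service, a service, or a version."""
    if version and not service:
        # This sketch only models the three patterns listed in the notes.
        raise ValueError("this sketch expects a service whenever a version is given")
    host = f"{project_id}.{region_id}.r.appspot.com"
    if service:
        host = f"{service}-dot-{host}"
    if version:
        host = f"{version}-dot-{host}"
    return f"https://{host}"


print(appspot_url("my-project", "uc"))
print(appspot_url("my-project", "uc", service="api"))
print(appspot_url("my-project", "uc", service="api", version="v2"))
```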
Deploying new version without Downtime
How do we go from V1 to V2 without downtime?
Option 1: If you are confident – Deploy and shift all traffic at once:
- Deploy and shift all traffic at once from v1 to v2:
  gcloud app deploy
Option 2: I want to manage the migration from V1 to V2
- Step 1: Deploy v2 without shifting any traffic to it (--no-promote)
  gcloud app deploy --no-promote
- Step 2: Shift traffic to V2:
  - Option 1: (All-at-once migration): Migrate all traffic to v2 at once
    gcloud app services set-traffic s1 --splits v2=1
  - Option 2: (Gradual migration): Gradually shift traffic to v2. Add the --migrate option
    - Gradual migration is not supported by App Engine Flexible Environment
  - Option 3: (Splitting): Control the pace of migration
    gcloud app services set-traffic s1 --splits=v2=.5,v1=.5
    - Useful for A/B testing
  - Ensure the new instances are warmed up before they receive traffic (app.yaml – inbound_services > warmup)
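The managed migration is a fixed sequence: deploy without promoting, then shift traffic. A minimal sketch that only assembles the two commands as argument lists and does not run them; the function name `migration_commands` is hypothetical, and passing the version ID through `--version` is a choice made here so the second command can refer to it.

```python
import shlex
from typing import List


def migration_commands(service: str, new_version: str,
                       gradual: bool = False) -> List[List[str]]:
    """Return the gcloud commands for a controlled migration, in order."""
    # Step 1: deploy the new version without routing any traffic to it.
    deploy = ["gcloud", "app", "deploy", "--no-promote", f"--version={new_version}"]
    # Step 2: send all traffic to the new version.
    shift = ["gcloud", "app", "services", "set-traffic", service,
             f"--splits={new_version}=1"]
    if gradual:
        # Gradual migration; per the notes, not available in the Flexible Environment.
        shift.append("--migrate")
    return [deploy, shift]


for cmd in migration_commands("s1", "v2", gradual=True):
    print(shlex.join(cmd))
```

Keeping the commands as lists rather than one string makes them safe to hand to `subprocess.run` later without shell quoting issues.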
Splitting traffic between multiple versions
How do you decide which version receives which traffic?
- IP Splitting – Based on request IP address
  - IP addresses can change, causing accuracy issues (for example, when I go from my house to a coffee shop)
  - If all requests originate from a corporate VPN with a single IP, all requests can go to the same version
- Cookie Splitting – Based on a cookie (GOOGAPPUID)
  - Cookies can be controlled from your application
  - Cookie splitting accurately assigns users to versions
- Random – Do it randomly
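Why IP and cookie splitting are "sticky" while random splitting is not can be shown with a hash-based assignment. This is an illustrative model only; how App Engine actually maps an IP or cookie to a version is not described in these notes, and `pick_version` is a hypothetical name.

```python
import hashlib
from typing import Dict


def pick_version(key: str, splits: Dict[str, float]) -> str:
    """Map a stable key (IP address or cookie value) to a version by weight."""
    total = sum(splits.values())
    if total <= 0:
        raise ValueError("splits must have a positive total weight")
    # Hash the key to a point in [0, 1); the same key always gives the same point.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    point = int.from_bytes(digest[:8], "big") / 2**64
    cumulative = 0.0
    ordered = sorted(splits.items())
    for version, weight in ordered:
        cumulative += weight / total
        if point < cumulative:
            return version
    return ordered[-1][0]  # guard against floating-point rounding at the top edge


splits = {"v1": 0.5, "v2": 0.5}
# Same key, same version on every request. A changed IP is a new key,
# which is why IP splitting can move a user between versions.
print(pick_version("203.0.113.7", splits))
print(pick_version("203.0.113.7", splits))
```

Under this model, all users behind one shared VPN address present the same key and therefore land on the same version, which matches the caveat in the list above.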
How to do it?
- Include the --split-by option in the gcloud app services set-traffic command
- Value must be one of: cookie, ip, random
gcloud app services set-traffic s1 --splits=v2=.5,v1=.5 --split-by=cookie
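When scripting this command, it can help to validate the split settings before calling gcloud. A sketch under the assumption that you build the argument list yourself; the helper name `set_traffic_command` is hypothetical, and the allowed values are the three listed above.

```python
from typing import Dict, List

ALLOWED_SPLIT_BY = ("cookie", "ip", "random")


def set_traffic_command(service: str, splits: Dict[str, float],
                        split_by: str) -> List[str]:
    """Build a 'gcloud app services set-traffic' argument list after basic checks."""
    if split_by not in ALLOWED_SPLIT_BY:
        raise ValueError(f"split_by must be one of {ALLOWED_SPLIT_BY}")
    if not splits or any(w < 0 for w in splits.values()) or sum(splits.values()) <= 0:
        raise ValueError("splits needs non-negative weights with a positive total")
    # Render the mapping as VERSION=WEIGHT pairs, e.g. v2=0.5,v1=0.5
    spec = ",".join(f"{version}={weight}" for version, weight in splits.items())
    return ["gcloud", "app", "services", "set-traffic", service,
            f"--splits={spec}", f"--split-by={split_by}"]


print(" ".join(set_traffic_command("s1", {"v2": 0.5, "v1": 0.5}, "cookie")))
```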