GCP: Instance Groups – Chethan S Poojary

Two Types of Instance Groups:

Managed: Identical VMs created using a template:
- Features: Auto scaling, auto healing & other services
Unmanaged: Different configurations for VMs in same group:
- Does NOT offer auto scaling, auto healing.
- Not recommended unless you need different kinds of VMs

Location can be Zonal or Regional

Regional gives you higher availability (RECOMMENDED)

Managed Instance Groups (MIG)

A MIG is a group of identical VMs created from an instance template.

Important Features:
- Maintain a set number of instances: If an instance crashes, the MIG launches a replacement.
- Detect application failures using health checks (Self Healing)
- Increase or decrease the number of instances based on load (Auto Scaling)
- Add a load balancer to distribute load
- Create instances in multiple zones (regional MIGs): regional MIGs provide higher availability compared to zonal MIGs
- Release new application versions without downtime
  - Rolling updates: release new version step by step (gradually). Update a percentage of instances to the new version at a time.
  - Canary Deployment: Test new version with a group of instances before releasing it across all instances.

Creating Managed Instance Group (MIG)

Instance template is mandatory
Configure auto-scaling to automatically adjust number of instances based on load:
- Minimum number of instances
- Maximum number of instances
- Autoscaling metrics: CPU utilisation target, load balancer utilisation target, or any other metric from Cloud Monitoring (formerly Stackdriver)
  - Cool-down period: How long should the autoscaler wait after a new instance starts before using its metrics?
  - Scale-in controls: Prevent a sudden drop in the number of instances
- Auto-healing: Configure a health check with an initial delay (how long to wait for your app to initialise before running a health check)

Updating a Managed Instance Group (MIG)

Rolling update: Gradual update of the instances in the group to a new instance template.
Rolling restart/replace: Gradual restart or replacement of all instances in the group.

Instance Group Scenarios

Scenario	Solution
You want a MIG-managed application to survive zonal failures	Create a multi-zone MIG (regional MIG)
You want to create VMs with different configurations in the same group	Create an unmanaged instance group
You want to preserve VM state in a MIG	Stateful MIG: preserves VM state (instance name, attached persistent disks, and metadata). Recommended for stateful workloads (databases, data processing apps)
You want high availability in a MIG even during hardware/software updates	Use an instance template with the availability policy set to automatic restart: enabled and on-host maintenance: migrate. This ensures live migration and automatic restarts
You want unhealthy instances to be replaced automatically	Configure a health check on the MIG (self healing)
You want to avoid frequent scale-ups and scale-downs	Configure a cool-down period/initial delay
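
Two of the scenarios above (surviving zonal failures, and staying available during host maintenance) can be sketched with gcloud as follows. This is a minimal sketch, not a definitive setup: the names my-ha-template and my-regional-mig, the region, the machine type, and the group size are all assumptions chosen for illustration. The script prints the commands by default; set GCLOUD=gcloud to actually run them.

```shell
# Dry-run by default: prints each command. Set GCLOUD=gcloud to execute.
GCLOUD="${GCLOUD:-echo gcloud}"

REGION="us-central1"        # assumed region
TEMPLATE="my-ha-template"   # hypothetical template name

# Instance template whose VMs live-migrate during host maintenance
# and restart automatically after a failure.
$GCLOUD compute instance-templates create "$TEMPLATE" \
  --machine-type=e2-small \
  --maintenance-policy=MIGRATE \
  --restart-on-failure

# Regional MIG: instances are spread across zones in the region,
# so the group keeps serving if a single zone fails.
$GCLOUD compute instance-groups managed create my-regional-mig \
  --region="$REGION" \
  --template="$TEMPLATE" \
  --size=3
```

Using --region instead of --zone is what makes the group regional; everything else matches a zonal MIG.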

Playing with Managed Instance Groups – Basics

Create instance group: create

gcloud compute instance-groups managed

gcloud compute instance-groups managed create my-mig --zone us-central1-a --template=https://compute.googleapis.com/compute/v1/projects/learning-450716/regions/us-central1/instanceTemplates/my-instance-templete-with-startup-script --size 1
// Options
--health-check=HEALTH_CHECK: How do you decide if an instance is healthy?
--initial-delay: How much time should you give an instance to start?
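
To show how the health check and initial delay options fit together, here is a minimal sketch that first creates an HTTP health check and then creates a MIG that uses it for auto-healing. The names my-health-check and my-instance-template, the port, the request path, and the timing values are assumptions for illustration. It also assumes a firewall rule already lets Google's health check probes reach the instances. The script prints the commands by default; set GCLOUD=gcloud to run them.

```shell
# Dry-run by default: prints each command. Set GCLOUD=gcloud to execute.
GCLOUD="${GCLOUD:-echo gcloud}"

HEALTH_CHECK="my-health-check"   # hypothetical health check name

# HTTP health check: probe "/" on port 80 every 10s;
# three consecutive failures mark the instance unhealthy.
$GCLOUD compute health-checks create http "$HEALTH_CHECK" \
  --port=80 \
  --request-path=/ \
  --check-interval=10s \
  --timeout=5s \
  --unhealthy-threshold=3

# MIG with auto-healing: give each new instance 120 seconds to boot
# before health check results can trigger a recreate.
$GCLOUD compute instance-groups managed create my-mig \
  --zone=us-central1-a \
  --template=my-instance-template \
  --size=1 \
  --health-check="$HEALTH_CHECK" \
  --initial-delay=120
```

If the initial delay is shorter than the application's real startup time, the MIG may recreate instances that were still booting, so it is usually safer to err on the long side.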

Setup Autoscaling: set-autoscaling/stop-autoscaling

gcloud compute instance-groups managed set-autoscaling my-mig --max-num-replicas=10

--cool-down-period (default: 60s): How long should the autoscaler wait after a new instance starts before using its metrics?
--scale-based-on-cpu --target-cpu-utilization --scale-based-on-load-balancing --target-load-balancing-utilization
--min-num-replicas --mode (on (default)/off/only-scale-out)
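
Put together, a fuller autoscaling configuration might look like the sketch below. The zone and every numeric value (2 to 10 replicas, 60% CPU target, 90 second cool-down, scale-in limit of 2 instances per 10 minutes) are assumptions for illustration, not recommendations. The script prints the command by default; set GCLOUD=gcloud to run it.

```shell
# Dry-run by default: prints the command. Set GCLOUD=gcloud to execute.
GCLOUD="${GCLOUD:-echo gcloud}"

MIG="my-mig"
ZONE="us-central1-a"   # assumed zone

# Scale between 2 and 10 instances, aiming for 60% average CPU.
# --cool-down-period: seconds to wait after a VM starts before using its metrics.
# --scale-in-control: remove at most 2 instances within any 600-second window.
$GCLOUD compute instance-groups managed set-autoscaling "$MIG" \
  --zone="$ZONE" \
  --min-num-replicas=2 \
  --max-num-replicas=10 \
  --target-cpu-utilization=0.6 \
  --cool-down-period=90 \
  --scale-in-control max-scaled-in-replicas=2,time-window=600 \
  --mode=on
```

Switching --mode to only-scale-out lets the group grow under load but never shrink, which can be useful while tuning the other values.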

gcloud compute instance-groups managed stop-autoscaling my-mig

Updating existing MIG policies (Ex: auto healing policies):

gcloud compute instance-groups managed update my-mig

--initial-delay: How much time should you give the instance to start before marking it as unhealthy?
--health-check: How do you decide if an instance is healthy?
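
As a concrete illustration, the sketch below attaches a health check to an existing MIG with a longer initial delay, and then shows how the auto-healing policy could be removed again. The health check name my-health-check, the zone, and the 300 second delay are assumptions. The script prints the commands by default; set GCLOUD=gcloud to run them.

```shell
# Dry-run by default: prints each command. Set GCLOUD=gcloud to execute.
GCLOUD="${GCLOUD:-echo gcloud}"

MIG="my-mig"
ZONE="us-central1-a"   # assumed zone

# Set (or change) the auto-healing policy on an existing MIG.
$GCLOUD compute instance-groups managed update "$MIG" \
  --zone="$ZONE" \
  --health-check=my-health-check \
  --initial-delay=300

# Remove the auto-healing policy entirely.
$GCLOUD compute instance-groups managed update "$MIG" \
  --zone="$ZONE" \
  --clear-autohealing
```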

Gcloud and MIG – Making Updates

Resize the group:

gcloud compute instance-groups managed resize my-mig --size=5

Recreate one or more instances (Delete and recreate instances):

gcloud compute instance-groups managed recreate-instances my-mig --instances=my-instance-1,my-instance-2

Update specific instances:

gcloud compute instance-groups managed update-instances my-mig --instances=my-instance-3,my-instance-4
// Updates only the listed instances in the group
// --minimal-action=none(default)/refresh/restart/replace
// --most-disruptive-allowed-action=none/refresh/restart/replace(default)
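
The two action flags bound how disruptive the update may be. In the sketch below, each listed instance is at least restarted, and the update is allowed to go as far as replacing the instance if the change requires it. The instance names and the zone are assumptions. The script prints the command by default; set GCLOUD=gcloud to run it.

```shell
# Dry-run by default: prints the command. Set GCLOUD=gcloud to execute.
GCLOUD="${GCLOUD:-echo gcloud}"

INSTANCES="my-instance-3,my-instance-4"   # hypothetical instance names

# Apply the group's current configuration to two specific instances.
# --minimal-action: the least disruptive action to perform (here: restart).
# --most-disruptive-allowed-action: the ceiling; the update fails if a
#   change would need something more disruptive than this.
$GCLOUD compute instance-groups managed update-instances my-mig \
  --zone=us-central1-a \
  --instances="$INSTANCES" \
  --minimal-action=restart \
  --most-disruptive-allowed-action=replace
```

Setting the ceiling to refresh or restart is a way to guard against an instance being deleted unexpectedly during a targeted update.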

Update instance template:

gcloud compute instance-groups managed set-instance-template my-mig --template=v2-template

Rolling Actions

Scenario: You want to roll out a new release (v1 to v2) without downtime

gcloud compute instance-groups managed rolling-action

Restart (stop and start): gcloud compute instance-groups managed rolling-action restart my-mig
- --max-unavailable=5 or 10% (max number of instances that can be down at a time during the restart)
Replace (delete and recreate): gcloud compute instance-groups managed rolling-action replace my-mig
- --max-surge=5 or 10% (max number of extra instances created above the target size during the update)
- --max-unavailable=5 or 10% (max number of instances that can be down during the update)
- --replacement-method=recreate/substitute (substitute (default) creates instances with new names; recreate reuses names)
Update instances to a new template:
- Basic version (update all instances gradually, step by step): gcloud compute instance-groups managed rolling-action start-update my-mig --version=template=v2-template
- Canary version (update a subset of instances to v2): gcloud compute instance-groups managed rolling-action start-update my-mig --version=template=v1-template --canary-version=template=v2-template,target-size=10%
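
A canary release usually ends with promoting the new template to the whole group. The sketch below shows one possible sequence: start the canary, wait for the group to settle, then roll v2 out everywhere. The zone and the surge/unavailable values are assumptions, and in practice you would check your own metrics between steps 2 and 3 rather than promoting automatically. The script prints the commands by default; set GCLOUD=gcloud to run them.

```shell
# Dry-run by default: prints each command. Set GCLOUD=gcloud to execute.
GCLOUD="${GCLOUD:-echo gcloud}"

MIG="my-mig"
ZONE="us-central1-a"   # assumed zone

# Step 1: canary. Keep v1 as the main version, move 10% of instances to v2.
# max-surge=1 / max-unavailable=0: add one new instance at a time and
# never take a serving instance down before its replacement is up.
$GCLOUD compute instance-groups managed rolling-action start-update "$MIG" \
  --zone="$ZONE" \
  --version=template=v1-template \
  --canary-version=template=v2-template,target-size=10% \
  --max-surge=1 \
  --max-unavailable=0

# Step 2: block until the group has finished applying the change.
$GCLOUD compute instance-groups managed wait-until "$MIG" --stable --zone="$ZONE"

# Step 3: promote. Make v2 the only version so all instances are updated.
$GCLOUD compute instance-groups managed rolling-action start-update "$MIG" \
  --zone="$ZONE" \
  --version=template=v2-template
```

To abandon the canary instead, the same start-update command with only --version=template=v1-template moves the canary instances back to v1.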