Google Compute: Optimising Costs and Performance in Google Cloud Platform


Step 01: Understanding Sustained Use Discounts in GCP – Google Cloud Platform

Sustained use discounts

Automatic discounts for running VM instances for a significant portion of the billing month

  • Example: If you use N1 or N2 machine types for more than 25% of a month, you get a 20% to 50% discount on every incremental minute.
  • The discount increases with usage.
  • No action required on your part!

Applies to instances created by Google Kubernetes Engine and Compute Engine.

RESTRICTIONS:

  • Does not apply to certain machine types (e.g. E2 and A2)
  • Does not apply to VMs created by App Engine flexible and Dataflow
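
To make "the discount increases with usage" concrete, here is a minimal Python sketch of a tiered incremental rate. The tier values in `ILLUSTRATIVE_TIERS` are assumptions chosen for illustration, not an official price list, and both helper functions are hypothetical names rather than part of any GCP library. Check the current pricing page for the schedule that applies to your machine type.

```python
# Illustrative tiers (ASSUMED values, not an official price list):
# each entry is (share of the month, fraction of the base rate charged).
ILLUSTRATIVE_TIERS = [(0.25, 1.0), (0.25, 0.8), (0.25, 0.6), (0.25, 0.4)]


def effective_rate_fraction(usage_fraction, tiers=ILLUSTRATIVE_TIERS):
    """Return the share of the full-month base price that is actually billed.

    usage_fraction is the part of the billing month the VM ran (0.0 to 1.0).
    """
    if not 0.0 <= usage_fraction <= 1.0:
        raise ValueError("usage_fraction must be between 0 and 1")
    billed = 0.0
    remaining = usage_fraction
    for tier_width, rate_multiplier in tiers:
        # Usage fills the cheaper tiers only after the earlier ones are used up.
        used_in_tier = min(remaining, tier_width)
        billed += used_in_tier * rate_multiplier
        remaining -= used_in_tier
        if remaining <= 0:
            break
    return billed


def sustained_use_discount(usage_fraction, tiers=ILLUSTRATIVE_TIERS):
    """Overall discount versus paying the base rate for the same usage."""
    if usage_fraction == 0:
        return 0.0
    return 1.0 - effective_rate_fraction(usage_fraction, tiers) / usage_fraction
```

Under these assumed tiers, a VM that runs for only the first quarter of the month gets no discount, while one that runs the whole month is billed 70% of the base price. This is why the discount needs no action on your part: it depends only on how long the instance runs.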


Step 02 – Understanding Committed Use Discounts in GCP – Google Cloud Platform

Committed use discounts

  • For workloads with predictable resource needs
  • Commit for 1 year or 3 years
  • Up to 70% discount based on machine type and GPUs
  • Applicable for instances created by Google Kubernetes Engine and Compute Engine
  • Does not apply to VMs created by App Engine flexible and Dataflow
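
Whether a commitment pays off depends on how predictable the workload really is, because committed resources are paid for during the whole term whether or not they run. The sketch below is a simplified comparison in Python. `cheaper_to_commit` is a hypothetical helper, and it assumes on-demand usage earns no other discount.

```python
def cheaper_to_commit(expected_utilisation, commitment_discount):
    """Compare committing with paying on demand, per unit of base price.

    expected_utilisation: fraction of the term the resources will really run.
    commitment_discount: for example 0.7 for the "up to 70%" case.
    ASSUMPTION: on-demand usage receives no other discount.
    """
    if not 0.0 <= expected_utilisation <= 1.0:
        raise ValueError("expected_utilisation must be between 0 and 1")
    if not 0.0 <= commitment_discount < 1.0:
        raise ValueError("commitment_discount must be in [0, 1)")
    committed_cost = 1.0 - commitment_discount  # paid for the whole term
    on_demand_cost = expected_utilisation       # paid only while running
    return committed_cost < on_demand_cost
```

In this simplified model the break-even point is a utilisation of one minus the discount: the larger the discount, the less of the term the resources need to run before committing is the cheaper option.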


Step 03 – Saving Costs with Preemptible VMs

Preemptible VM

Short-lived, cheaper (up to 80%) compute instances

  • Can be stopped (preempted) by GCP at any time within 24 hours
  • Instances get a 30-second warning (to save anything they need to save)

Use preemptible VMs if:

  • Your applications are fault tolerant
  • You are very cost sensitive
  • Your workload is not immediate
  • Example: Non-immediate batch processing jobs
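
One common way to make a batch job fault tolerant is to record progress after every item, so that a replacement VM can resume where the preempted one stopped. The Python sketch below illustrates the idea. `CheckpointingWorker` is a hypothetical class, and it assumes the job runs as a process that receives SIGTERM when the VM starts shutting down. How the preemption notice reaches your process depends on your setup.

```python
import json
import signal
from pathlib import Path


class CheckpointingWorker:
    """Processes items in order and records progress after each one."""

    def __init__(self, checkpoint_path):
        self.checkpoint_path = Path(checkpoint_path)
        self.stop_requested = False
        self.next_index = self._load_checkpoint()

    def _load_checkpoint(self):
        # Resume from the saved position if an earlier run left one behind.
        if self.checkpoint_path.exists():
            return json.loads(self.checkpoint_path.read_text())["next_index"]
        return 0

    def _save_checkpoint(self):
        self.checkpoint_path.write_text(json.dumps({"next_index": self.next_index}))

    def request_stop(self, signum=None, frame=None):
        # Keep the handler tiny: set a flag and let the loop finish its item.
        self.stop_requested = True

    def install_signal_handler(self):
        # ASSUMPTION: the process is sent SIGTERM when the VM begins shutting down.
        signal.signal(signal.SIGTERM, self.request_stop)

    def run(self, items, process_item):
        while self.next_index < len(items) and not self.stop_requested:
            process_item(items[self.next_index])
            self.next_index += 1
            self._save_checkpoint()
        return self.next_index
```

Each item should take well under 30 seconds, so that the one in flight can finish and be checkpointed within the warning period. In practice the checkpoint file would live on storage that outlives the VM.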

RESTRICTIONS:

  • NOT always available
  • No SLA and CANNOT be migrated to regular VMs
  • No automatic restarts
  • Free Tier credits not applicable


Spot VMs

Spot VMs

  • Spot VMs: Latest version of preemptible VMs
  • Key difference: Spot VMs do not have a maximum runtime
    • Traditional preemptible VMs, by comparison, have a maximum runtime of 24 hours
  • Other features similar to traditional preemptible VMs


Step 04 – Understanding Billing for Google Compute Engine – GCP VMs

Google Compute Engine – Billing

  • You are billed by the second (after a minimum of 1 minute)
  • You are NOT billed for compute when a compute instance is stopped
    • However, you will be billed for any storage attached to it!
  • (RECOMMENDATION) Always create Budget alerts and make use of Budget export to stay on top of billing!
  • What are the ways you can save money?
    • Choose the right machine type and image for your workload
    • Be aware of the discounts available
      • Sustained use discounts
      • Committed use discounts
      • Discounts for preemptible VM instances
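
The per-second rule with a one-minute minimum can be sketched in a few lines of Python. Both functions are hypothetical helpers. Rounding a partial second up is an assumption made here for illustration, and the hourly rate is whatever applies to your machine type.

```python
import math

MINIMUM_BILLED_SECONDS = 60  # "a minimum of 1 minute"


def billable_seconds(run_seconds):
    """Seconds charged for one run of an instance."""
    if run_seconds <= 0:
        return 0
    # ASSUMPTION: a partial second is rounded up to the next whole second.
    return max(MINIMUM_BILLED_SECONDS, math.ceil(run_seconds))


def compute_charge(run_seconds, hourly_rate):
    """Compute-only charge for one run; attached storage is billed separately."""
    return billable_seconds(run_seconds) * hourly_rate / 3600
```

A VM that runs for 45 seconds is charged for 60, while one that runs for 90 seconds is charged for exactly 90.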


Step 05 – Achieving High Availability with Live Migration and Automatic Restart

Compute Engine: Live Migration & Availability Policy

How do you keep your VM instances running when a host system needs to be updated (a software or a hardware update needs to be performed)?

Live Migration

  • Your running instance is migrated to another host in the same zone
  • Does NOT change any attributes or properties of the VM
  • SUPPORTED for instances with local SSDs
  • Not supported for GPUs and preemptible instances

Important Configuration – Availability Policy:

  • On host maintenance: What should happen during periodic infrastructure maintenance?
  • Automatic restart – Restart VM instances if they are terminated due to non-user-initiated reasons (maintenance event, hardware failure, etc.)
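
The two availability settings interact with the restrictions on live migration noted above. As a rough Python sketch, the function below picks values for them. `build_scheduling` is a hypothetical helper. The key names follow the `scheduling` block of the Compute Engine instance resource as I understand it, so treat them as an assumption and verify against the API reference.

```python
def build_scheduling(has_gpu=False, preemptible=False):
    """Return a dict shaped like the `scheduling` block of an instance resource."""
    # Live migration is not supported for GPUs or preemptible instances.
    live_migration_supported = not (has_gpu or preemptible)
    return {
        "onHostMaintenance": "MIGRATE" if live_migration_supported else "TERMINATE",
        # Preemptible instances do not restart automatically.
        "automaticRestart": not preemptible,
        "preemptible": preemptible,
    }
```

A regular VM gets live migration plus automatic restart, a GPU VM is terminated during host maintenance but restarts automatically, and a preemptible VM gets neither.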


Step 06: Understanding Custom Machine Types

Compute Engine Features: Custom Machine Types

  • What do you do when predefined VM options are not appropriate for your workload?
    • Create a machine type customised to your needs (a Custom Machine Type)
  • Custom Machine Type: Adjust vCPUs, memory and GPUs
    • Choose between E2, N2, or N1 machine types
    • Supports a wide variety of operating systems: CentOS, CoreOS, Debian, Red Hat, Ubuntu, Windows, etc.
  • Billed per vCPU and per GB of memory provisioned to each instance
    • Example hourly price: $0.033174 / vCPU + $0.004446 / GB
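
The example price above translates directly into a small calculation. This Python sketch uses the two example rates as given. `custom_machine_hourly_cost` is a hypothetical helper, and real rates vary by machine family and region.

```python
# Example rates quoted above; actual prices vary by machine family and region.
PRICE_PER_VCPU_HOUR = 0.033174
PRICE_PER_GB_HOUR = 0.004446


def custom_machine_hourly_cost(vcpus, memory_gb):
    """Hourly cost of a custom machine type at the example rates."""
    if vcpus <= 0 or memory_gb <= 0:
        raise ValueError("vcpus and memory_gb must be positive")
    return vcpus * PRICE_PER_VCPU_HOUR + memory_gb * PRICE_PER_GB_HOUR
```

At these rates a custom machine with 4 vCPUs and 16 GB of memory costs about $0.2038 per hour.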


Step 07: Exploring GPUs in Google Compute Engine – GCE

Compute Engine Features: GPUs

How do you accelerate math-intensive and graphics-intensive workloads for AI/ML, etc.?

  • Add a GPU to your virtual machine:
    • High performance for math-intensive and graphics-intensive workloads
    • Higher Cost
    • (REMEMBER) Use images with GPU libraries (Deep Learning) installed
      • OTHERWISE, GPU will not be used
    • GPU restrictions:
      • NOT supported on all machine types (for example, not supported on shared-core or memory-optimised machine types)
      • On host maintenance can only have the value “Terminate VM instance”
  • Recommended availability policy for GPUs
    • Automatic restart – on


Step 08: Virtual Machine – Remember

Virtual Machine Quick Review

  • Associated with a project
  • Machine type availability can vary from region to region
  • You can only change the machine type (adjust the number of vCPUs and memory) of a stopped instance
    • You CANNOT change the machine type of a running instance
  • VMs can be filtered by various properties
    • Name, Zone, Machine Type, Internal/External IP, Network, Labels, etc.
  • Instances are zonal (they run in a specific zone within a specific region)
    • Images are global (you can provide access to other projects, if needed)
    • Instance templates are global (unless you use zonal resources in your templates)
  • Automatic Basic Monitoring is enabled
    • Default Metrics: CPU utilisation, Network Bytes (in/out), Disk throughput/IOPS
    • For Memory Utilisation & Disk Utilisation – Cloud Monitoring agent is needed


Step 09: Best Practices with VM

Virtual Machine – Best Practices

  • Choose Zone and Region based on:
    • Cost, regulation, availability needs, latency and specific hardware needs
    • Distribute instances across multiple zones and regions for high availability
  • Choose the right machine type for your needs:
    • Experiment with them to find the right machine type
    • Use GPUs for math-intensive and graphics-intensive applications
  • Use “committed use discounts” for constant workloads
  • Use preemptible instances for fault-tolerant, NON-time-critical workloads
  • Use labels to indicate environment, team, business unit, etc.


Step 10: Scenarios – Virtual Machines in Google Cloud Platform

Compute Engine Scenarios

  1. You want dedicated hardware for your compliance, licensing, and management needs
    • Answer: Sole-tenant nodes
  2. You have 1000s of VMs and you want to automate OS patch management, OS inventory management and OS configuration management (manage software installed)
    • Answer: Use “VM Manager”
  3. You want to log in to your VM instance to install software
    • Answer: You can SSH into it
  4. You do not want to expose a VM to the internet
    • Answer: Do NOT assign an external IP address
  5. You want to allow HTTP traffic to your VM
    • Answer: Configure firewall rules
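
For the last scenario, a firewall rule that admits HTTP traffic might look roughly like the dictionary built below. This is a sketch in Python, not a definitive configuration. `http_ingress_rule` is a hypothetical helper, and the key names follow the Compute Engine firewall resource as I understand it. The rule name, network path and tag in the usage note are made-up examples.

```python
def http_ingress_rule(name, network, target_tag):
    """Build a request body for an ingress rule that allows HTTP (TCP port 80)."""
    return {
        "name": name,
        "network": network,
        "direction": "INGRESS",
        "allowed": [{"IPProtocol": "tcp", "ports": ["80"]}],
        # 0.0.0.0/0 admits traffic from any IPv4 address.
        "sourceRanges": ["0.0.0.0/0"],
        # Only VMs carrying this network tag are affected by the rule.
        "targetTags": [target_tag],
    }
```

Calling `http_ingress_rule("allow-http", "global/networks/default", "http-server")` yields a rule scoped to VMs tagged `http-server`, so other instances on the same network stay closed to HTTP.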