[Bounty #524] Google Cloud (GKE) support for sifnode #3449
BetsyMalthus wants to merge 7 commits into Sifchain:master from
Conversation
Pull request overview
Adds initial scaffolding to support deploying sifnode on Google Cloud via Google Kubernetes Engine (GKE), including Terraform infrastructure, Helm values, a deployment script, and documentation.
Changes:
- Added a Terraform module to provision a GKE cluster, networking, IAM/service accounts, and Kubernetes storage resources.
- Added GKE-specific Helm values and a new provider value.
- Added a deploy_gke.sh script and a GKE deployment guide.
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 11 comments.
| File | Description |
|---|---|
| docs/gke-deployment.md | Documents GKE deployment steps (scripted + manual). |
| deploy/terraform/providers/gke/main.tf | Provisions GKE + network + firewall + K8s storage resources. |
| deploy/terraform/providers/gke/variables.tf | Defines configurable Terraform variables for the GKE module. |
| deploy/terraform/providers/gke/outputs.tf | Exposes cluster connection outputs and helper commands. |
| deploy/helm/sifnode/values.yaml | Introduces a provider value (aws default). |
| deploy/helm/sifnode/values_gke.yaml | Adds GKE-specific Helm overrides (storage class, service, resources, etc.). |
| deploy/gke/deploy_gke.sh | Automates create/destroy/update/status for GKE deployments. |
```bash
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_DIR="$(dirname "$SCRIPT_DIR")"

# ── Defaults ──────────────────────────────────────────────────
PROJECT_ID="${GCP_PROJECT_ID:-}"
REGION="${GCP_REGION:-us-central1}"
CLUSTER_NAME="${CLUSTER_NAME:-sifnode-gke}"
NAMESPACE="${NAMESPACE:-sifnode}"
DEPLOY_ENV="${DEPLOY_ENV:-gke}"
TERRAFORM_DIR="${PROJECT_DIR}/deploy/terraform/providers/gke"
HELM_CHART="${PROJECT_DIR}/deploy/helm/sifnode"
HELM_VALUES="${PROJECT_DIR}/deploy/helm/sifnode/values_gke.yaml"
```
PROJECT_DIR is computed as the parent of deploy/gke, which makes TERRAFORM_DIR/HELM_CHART point to ${repo}/deploy/deploy/... (non-existent). This will cause cd "$TERRAFORM_DIR" and Helm installs to fail when running the script from a fresh clone. Compute the repo root (e.g., two directories up from SCRIPT_DIR, or via git rev-parse --show-toplevel) and build paths from that.
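A minimal sketch of the fix, assuming the script stays at deploy/gke/deploy_gke.sh (two levels below the repo root):

```bash
#!/usr/bin/env bash
# The script lives in deploy/gke/, so the repo root is two levels up.
# (git rev-parse --show-toplevel works too, but requires a git checkout.)
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
REPO_ROOT="$(cd "${SCRIPT_DIR}/../.." && pwd)"

TERRAFORM_DIR="${REPO_ROOT}/deploy/terraform/providers/gke"
HELM_CHART="${REPO_ROOT}/deploy/helm/sifnode"
HELM_VALUES="${REPO_ROOT}/deploy/helm/sifnode/values_gke.yaml"
```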
```bash
# Install with GKE-specific values
helm install sifnode deploy/helm/sifnode \
  --namespace sifnode \
  --create-namespace \
  --values deploy/helm/sifnode/values_gke.yaml \
  --set provider=gke \
  --set persistence.storageClass=gke-ssd
```
The Helm install path assumes deploy/helm/sifnode is a Helm chart, but in this repo that directory contains only values files and no Chart.yaml/templates, so helm install sifnode deploy/helm/sifnode will fail. Either add the missing Helm chart structure, or update the docs/script to reference the actual chart location.
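If deploy/helm/sifnode is meant to be the chart itself, something like the following Chart.yaml (plus a templates/ directory) would have to exist there; the name and versions below are illustrative, not taken from this PR:

```yaml
# deploy/helm/sifnode/Chart.yaml (hypothetical scaffold)
apiVersion: v2
name: sifnode
description: Helm chart for deploying sifnode on Kubernetes
type: application
version: 0.1.0        # chart version, illustrative
appVersion: "1.0.0"   # application version, illustrative
```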
```yaml
# GKE node configuration
nodeSelector:
  cloud.google.com/gke-nodepool: sifnode-pool
```
nodeSelector.cloud.google.com/gke-nodepool is set to sifnode-pool, but the Terraform node pool is named ${cluster_name}-node-pool (e.g., sifnode-gke-node-pool). With this mismatch, pods may remain Pending due to unschedulable nodeSelector. Align the value with the actual node pool name, or remove the nodeSelector and make it configurable.
Suggested change:
```diff
-  cloud.google.com/gke-nodepool: sifnode-pool
+  cloud.google.com/gke-nodepool: sifnode-gke-node-pool
```
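If the selector is kept but made configurable instead, one option is to feed the real pool name in at install time; the node_pool_name Terraform output used here is hypothetical and would need to be added to outputs.tf:

```bash
# Read the node pool name from Terraform (assumes a `node_pool_name` output exists)
NODE_POOL="$(terraform -chdir=deploy/terraform/providers/gke output -raw node_pool_name)"

# Dots inside the label key must be escaped for --set
helm upgrade --install sifnode deploy/helm/sifnode \
  --namespace sifnode --create-namespace \
  --values deploy/helm/sifnode/values_gke.yaml \
  --set "nodeSelector.cloud\.google\.com/gke-nodepool=${NODE_POOL}"
```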
| description = "Map of authorized CIDR blocks for master access" | ||
| type = map(string) | ||
| default = { | ||
| "default" = "0.0.0.0/0" | ||
| } |
Defaulting authorized_cidrs to 0.0.0.0/0 effectively opens the GKE control plane to the public internet when master_authorized_networks_config is enabled. Consider making this variable required (no default) or defaulting to a private/admin CIDR range, and document how to set it safely.
| description = "Map of authorized CIDR blocks for master access" | |
| type = map(string) | |
| default = { | |
| "default" = "0.0.0.0/0" | |
| } | |
| description = "Map of authorized CIDR blocks for master access. Must be explicitly set to trusted private/admin CIDR ranges; do not use 0.0.0.0/0." | |
| type = map(string) |
| description = "CIDR ranges allowed for SSH access" | ||
| type = list(string) | ||
| default = ["0.0.0.0/0"] |
ssh_source_ranges defaults to 0.0.0.0/0, which opens SSH (port 22) on nodes to the public internet via the firewall rule. Use a safer default (e.g., empty list + no rule, or a placeholder CIDR that forces explicit configuration) and document how to set it.
| description = "CIDR ranges allowed for SSH access" | |
| type = list(string) | |
| default = ["0.0.0.0/0"] | |
| description = "CIDR ranges allowed for SSH access. Set this explicitly to trusted admin network CIDRs (for example in terraform.tfvars or via -var='ssh_source_ranges=[\"198.51.100.10/32\"]'); the default is a non-routable placeholder to avoid exposing SSH publicly." | |
| type = list(string) | |
| default = ["203.0.113.0/32"] |
| resource "google_project_iam_member" "sifnode_permissions" { | ||
| for_each = toset([ | ||
| "roles/container.admin", | ||
| "roles/compute.admin", | ||
| "roles/iam.serviceAccountUser", | ||
| "roles/storage.admin" | ||
| ]) | ||
| role = each.key | ||
| member = "serviceAccount:${google_service_account.sifnode.email}" | ||
| } |
The node pool service account is granted broad project-level admin roles (container.admin, compute.admin, storage.admin). Because this SA is attached to every node, any pod compromise can escalate to full project compromise. Prefer least-privilege roles for node operation (logging/monitoring, pull images, etc.) and use Workload Identity for any in-cluster components that need elevated permissions.
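A least-privilege sketch for comparison, assuming the module already exposes var.project_id; the role list mirrors Google's standard recommendations for node service accounts and is illustrative rather than taken from this PR:

```hcl
# Minimal roles typically needed by GKE nodes (logs, metrics, image pulls).
# Workloads that need more should use Workload Identity instead.
resource "google_project_iam_member" "sifnode_node_roles" {
  for_each = toset([
    "roles/logging.logWriter",
    "roles/monitoring.metricWriter",
    "roles/monitoring.viewer",
    "roles/artifactregistry.reader",
  ])

  project = var.project_id   # assumed variable
  role    = each.key
  member  = "serviceAccount:${google_service_account.sifnode.email}"
}
```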
| output "helm_install_command" { | ||
| description = "Helm install command for sifnode on GKE" | ||
| value = <<-EOT | ||
| # After configuring kubectl context: | ||
| ${google_container_cluster.primary.name} | ||
|
|
||
| # Install sifnode via Helm: | ||
| helm install sifnode deploy/helm/sifnode \\ | ||
| --namespace sifnode \\ | ||
| --create-namespace \\ |
helm_install_command appears to output just the cluster name under "After configuring kubectl context" rather than the actual gcloud container clusters get-credentials ... command (you already have kubectl_config as a separate output). Update this heredoc to include the correct kubectl configuration command so the output is directly usable.
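A sketch of a directly usable version of the output, assuming the module defines var.region and var.project_id (adjust to whatever variable names the module actually uses):

```hcl
output "helm_install_command" {
  description = "Helm install command for sifnode on GKE"
  value       = <<-EOT
    # Configure kubectl context first:
    gcloud container clusters get-credentials ${google_container_cluster.primary.name} \
      --region ${var.region} --project ${var.project_id}

    # Install sifnode via Helm:
    helm install sifnode deploy/helm/sifnode \
      --namespace sifnode \
      --create-namespace
  EOT
}
```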
| resource "kubernetes_persistent_volume_claim" "sifnoded" { | ||
| metadata { | ||
| name = "sifnoded-pvc" | ||
| namespace = var.namespace | ||
| labels = var.tags | ||
| } | ||
| spec { |
The PVCs are created in var.namespace, but Terraform never creates that namespace. On a new cluster, kubernetes_persistent_volume_claim will fail with "namespaces "sifnode" not found". Add a kubernetes_namespace resource (and ensure the PVCs depend on it), or move PVC creation entirely to Helm.
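A minimal sketch of the missing namespace resource; referencing it from the PVC metadata gives Terraform the implicit dependency this comment asks for:

```hcl
resource "kubernetes_namespace" "sifnode" {
  metadata {
    name   = var.namespace
    labels = var.tags
  }
}

# In the existing PVCs, reference the namespace resource so it is created first:
#   namespace = kubernetes_namespace.sifnode.metadata[0].name
```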
| resource "google_compute_disk" "sifnoded_data" { | ||
| name = "${var.cluster_name}-sifnoded-data" | ||
| type = "pd-ssd" | ||
| zone = var.default_zone | ||
| size = var.data_disk_size | ||
| labels = var.tags | ||
| } | ||
|
|
||
| resource "google_compute_disk" "sifnodecli_data" { | ||
| name = "${var.cluster_name}-sifnodecli-data" | ||
| type = "pd-ssd" | ||
| zone = var.default_zone | ||
| size = var.data_disk_size | ||
| labels = var.tags | ||
| } | ||
|
|
google_compute_disk.sifnoded_data and google_compute_disk.sifnodecli_data are created but never referenced/attached (PVCs will dynamically provision their own PDs via the StorageClass). This will create unused disks (cost) and likely isn't doing what is intended. Remove these disks or explicitly wire them into Kubernetes PVs/claims.
| resource "google_compute_disk" "sifnoded_data" { | |
| name = "${var.cluster_name}-sifnoded-data" | |
| type = "pd-ssd" | |
| zone = var.default_zone | |
| size = var.data_disk_size | |
| labels = var.tags | |
| } | |
| resource "google_compute_disk" "sifnodecli_data" { | |
| name = "${var.cluster_name}-sifnodecli-data" | |
| type = "pd-ssd" | |
| zone = var.default_zone | |
| size = var.data_disk_size | |
| labels = var.tags | |
| } |
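Alternatively, if the disks are intended to back the sifnoded/sifnodecli volumes, a hedged sketch of statically binding one of them as a PV through the GCE PD CSI driver (the StorageClass name and filesystem type are assumptions, and the matching PVC would need to bind to this volume):

```hcl
resource "kubernetes_persistent_volume" "sifnoded_data" {
  metadata {
    name = "sifnoded-data-pv"
  }
  spec {
    capacity = {
      storage = "${var.data_disk_size}Gi"
    }
    access_modes                     = ["ReadWriteOnce"]
    storage_class_name               = "gke-ssd"   # assumed StorageClass name
    persistent_volume_reclaim_policy = "Retain"
    persistent_volume_source {
      csi {
        driver = "pd.csi.storage.gke.io"
        # google_compute_disk.*.id has the form projects/PROJECT/zones/ZONE/disks/NAME,
        # which is the volume handle the PD CSI driver expects.
        volume_handle = google_compute_disk.sifnoded_data.id
        fs_type       = "ext4"
      }
    }
  }
}
```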
| variable "cluster_version" { | ||
| description = "GKE cluster version (use 'latest' for current default)" | ||
| type = string | ||
| default = "latest" | ||
| } | ||
|
|
cluster_version is defined but never used in the module. Either wire it into google_container_cluster (e.g., min_master_version / node_version, or via release channel behavior) or remove the variable to avoid misleading configuration surface area.
| variable "cluster_version" { | |
| description = "GKE cluster version (use 'latest' for current default)" | |
| type = string | |
| default = "latest" | |
| } |
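If the variable is kept instead, a sketch of wiring it into the existing google_container_cluster resource; treating "latest" as "let GKE pick the default version" is an assumption about the intended semantics:

```hcl
resource "google_container_cluster" "primary" {
  # ... existing arguments unchanged ...

  # Pin the master version only when an explicit version is supplied;
  # "latest" leaves the choice to GKE / the release channel.
  min_master_version = var.cluster_version == "latest" ? null : var.cluster_version
}
```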
🎯 Closes #524
Summary
Implemented full Google Cloud Platform support for sifnode via Google Kubernetes Engine (GKE), following the existing AWS (EKS) deployment patterns.
Files Added/Modified
Key Features
deploy_gke.sh (create|destroy|update|status)
Testing
Bounty
$1,000 USDC
Built by BetsyMalthus