Harshetjain666/README.md

Harshet Jain.

Senior SRE · Platform Engineer · AWS Community Builder

Four years keeping multi-cloud Kubernetes alive under enterprise SLAs with penalty clauses — EKS on AWS, GKE on GCP, 100+ resources across isolated AWS Organization accounts.


LinkedIn  Medium  Twitter  Email


"I don't just keep the lights on. I make sure the building was designed so the lights can't go off."

@ Qubole (Nasdaq-listed big data platform) · multi-cloud production · enterprise SLAs


What I've shipped

i Cut cloud spend 30% (~$60K/year) across AWS + GCP — per-service dashboards, reserved capacity, spot for batch
ii MTTR 60 min → under 20 — 15+ PagerDuty-wired runbooks. Traced a live regression, rolled back in 18 minutes
iii SOC2 Type II in 6 months — IRSA, Workload Identity, Trivy + Snyk in CI. Zero long-lived credentials. Zero findings
iv Zero-downtime MySQL 5.7 → 8.0 — blue-green deploys, DMS lag monitoring, two weeks parallel validation
v P0 EKS node failure recovered in <30 min — drained nodes, shifted ALB weights mid-enterprise-traffic
vi Argo Rollouts canary — bad deploys caught at 5% traffic, auto-rolled back. Weekly batches → multiple daily releases
vii <2 min RTO on AZ failure — quarterly DR drills via AWS FIS. Chaos surfaced 4 failure modes before production did

4 yrs production SRE  ·  $60K saved / year  ·  99.9% SLO (10+ services)  ·  <20 min MTTR  ·  65% CVEs reduced
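For a sense of scale, the 99.9% SLO and the <20 min MTTR figures can be related through an error budget. The sketch below is illustrative arithmetic only: the 30-day window and the function names are assumptions, not something stated in this README.

```python
def error_budget_minutes(slo: float, window_days: int = 30) -> float:
    """Minutes of allowed unavailability for an availability SLO.

    `window_days` defaults to 30, which is an assumed window length.
    """
    if not 0.0 < slo < 1.0:
        raise ValueError("slo must be a fraction strictly between 0 and 1")
    return (1.0 - slo) * window_days * 24 * 60


def incidents_within_budget(slo: float, mttr_minutes: float, window_days: int = 30) -> int:
    """Whole incidents of length `mttr_minutes` that fit inside the budget."""
    return int(error_budget_minutes(slo, window_days) // mttr_minutes)


if __name__ == "__main__":
    budget = error_budget_minutes(0.999)
    print(f"99.9% over 30 days: {budget:.1f} min of budget")
    print("incidents at 20 min MTTR:", incidents_within_budget(0.999, 20))
    print("incidents at 60 min MTTR:", incidents_within_budget(0.999, 60))
```

Under these assumptions a single 60-minute outage would exceed the monthly budget on its own, while a 20-minute recovery leaves room for more than one incident.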

Stack

AWS  ·  GCP  ·  Kubernetes  ·  Terraform  ·  Prometheus  ·  Grafana  ·  ArgoCD  ·  Ansible  ·  Python  ·  Go  ·  Bash


Featured work

terraform-aws-eks-fargate-cluster    ★ 32   ⑂ 59

Production-ready Terraform module — EKS with Fargate, VPC, IRSA, RBAC wired from day one. Used by teams who don't want to start from scratch. Battle-tested at Qubole.


Writing

SRE war stories, cloud cost engineering, and infrastructure deep-dives — harshetjain.medium.com →




AWS Certified Solutions Architect  ·  RHCE  ·  RHCSA  ·  Red Hat Containers & Kubernetes

AWS Community Builder  ·  Qubole, New Delhi  ·  open to async-first remote roles worldwide


Pinned

  1. terraform-aws-eks-fargate-cluster

    Source code for my AWS EKS with Fargate cluster setup

    HCL 32 59

  2. DevOps-Groovy-Code

    Source code for my DevOps article

    Groovy 4 2

  3. Monitoring-prometheus-grafana

    Set up Prometheus and Grafana on Kubernetes and make them persistent using ConfigMap

    2 2

  4. terraform-code-management

    Source code for my Terraform management article

    HCL 2 4

  5. Ansible-Jenkins-pipeline

    Source code for a Jenkins pipeline with Ansible automation

    HTML 1 6

  6. aws-terraform-efs

    Terraform integration with AWS using EFS

    HTML 1