Job Description
About the Role
As a key member of our Platform Engineering team, you'll own and evolve the systems that keep our infrastructure fast, reliable, and developer-friendly. You'll lead our observability, CI/CD, and GitOps initiatives — empowering engineers to ship confidently and our systems to scale intelligently. You'll work in a high-traffic, data-driven environment built on EKS, FastAPI, MySQL/ClickHouse, and Kafka.
Kubernetes · EKS · Prometheus · Grafana · ArgoCD · GitOps · GitHub Actions · Terraform · Pulumi · Docker · AWS · Python · Bash · Kafka · ClickHouse · MySQL · Redis · KeyDB · Cassandra · Linkerd · Istio · IaC · CI/CD
Key Responsibilities
- Own and evolve our observability stack — from data collection through long-term retention and dashboarding.
- Implement GitOps workflows with ArgoCD, automating application lifecycles from commit to production.
- Maintain and scale EKS clusters — high availability, cost efficiency, and performance.
- Standardize environments, automate workflows, and reduce deployment friction.
- Define and enforce security best practices within CI/CD and AWS environments.
- Continuously improve monitoring, alerting, and incident response playbooks.
- Collaborate on data lake architecture supporting analytics and data engineering workflows.
Technical Skills Required
- Observability Expert: Deep Prometheus / Thanos / VictoriaMetrics, Grafana, PromQL, SLO design, dashboarding and alerting.
- K8S Expertise: Production workloads on Kubernetes or Amazon EKS — scaling, upgrades, cost management.
- GitOps Champion: ArgoCD, Flux, or similar tools managing production deployments.
- 5+ years of relevant SRE / Platform / DevOps experience.
Nice to Have
- Experience managing stateful services: MySQL, KeyDB/Redis, ClickHouse, Cassandra.
- Terraform or Pulumi for Infrastructure as Code.
- Scripting with Python, Bash, or Go for automation.
- Data-heavy or real-time systems experience: Kafka, analytics platforms.
- Seamless, automated deployments across all environments.
- Actionable observability: metrics and alerts that detect issues before users do.
- Zero-downtime rollouts and predictable releases.
- Developers fully empowered to ship code quickly and safely.
Full-time. In-office in Glendale, CA or fully remote. Competitive compensation based on experience.