Senior SRE/DevOps Engineer

Confidential • United States
Remote
AI Summary

The company is seeking a Senior SRE/DevOps Engineer to improve system reliability, strengthen platform foundations, and enable engineering teams to move faster safely. The role focuses on platform availability, operational resilience, and observability. The ideal candidate has 7+ years of experience in SRE, DevOps, or related infrastructure roles.

Key Responsibilities
Improve platform availability and operational resilience
Drive improvements around RTO, system performance, and operational recovery
Identify and remediate reliability bottlenecks across infrastructure and deployment systems
Strengthen observability, alerting quality, and operational tooling
Maintain and improve production Kubernetes infrastructure
Enhance deployment and CI/CD systems
Contribute to AWS and Cloudflare infrastructure architecture and reliability
Review infrastructure and operational changes across teams
Unblock engineers during incidents or complex operational work
Raise engineering standards around reliability, observability, and operational readiness
Technical Skills Required
AWS, Kubernetes, Datadog, Terraform, Go, Postgres
Benefits & Perks
Fully remote
Required Qualifications
7+ years of experience in SRE, DevOps, or related infrastructure roles
Comfort using AI/agentic tooling as part of day-to-day engineering workflows
Nice to Have
Experience with Go
Postgres operational experience
Cloudflare expertise
Fintech or other regulated-environment experience

Job Description


A US-based startup is confidentially hiring a fully remote, senior-level SRE / DevOps engineer to help scale the reliability, performance, and operational maturity of a rapidly growing cloud platform.


This is a highly hands-on infrastructure role focused on improving system reliability, strengthening platform foundations, and enabling engineering teams to move faster safely. The position is not a traditional people-management or feature-delivery role.


This engineer will operate as a senior technical contributor who raises the operational bar across the organization through infrastructure improvements, technical reviews, observability work, and incident support.


What You'll Do

Reliability and Platform Engineering

  • Improve platform availability and operational resilience
  • Drive improvements around RTO, system performance, and operational recovery
  • Identify and remediate reliability bottlenecks across infrastructure and deployment systems
  • Strengthen observability, alerting quality, and operational tooling


Kubernetes & Cloud Infrastructure

  • Maintain and improve production Kubernetes infrastructure
  • Enhance deployment and CI/CD systems, including GitHub Actions and Argo CD workflows
  • Build and evolve reusable Terraform modules and infrastructure patterns
  • Contribute to AWS and Cloudflare infrastructure architecture and reliability


Engineering Enablement

  • Review infrastructure and operational changes across teams
  • Unblock engineers during incidents or complex operational work
  • Raise engineering standards around reliability, observability, and operational readiness
  • Partner with product engineering teams without becoming embedded in day-to-day feature pairing


On-Call Expectations

  • Engineers in this role may be pulled in to assist during complex incidents
  • Participation in a small platform-focused on-call rotation covering AWS, Cloudflare, Kubernetes, and shared infrastructure issues


Required Qualifications

  • 7+ years of experience in SRE, DevOps, Platform Engineering, or related infrastructure roles
  • Demonstrated engineering track record in fully remote startup environments
  • Comfort using AI / agentic tooling as part of day-to-day engineering workflows, including tools such as Claude Code, Cursor, or similar
  • Currently operating at a Senior Engineer level (or equivalent)
  • 3+ years of deep, hands-on production experience with:
      • AWS
      • Kubernetes
      • Datadog
      • Terraform
  • Strong cloud architecture and infrastructure design skills
  • Solid understanding of security fundamentals and cloud security best practices
  • Deep observability, debugging, and system performance analysis experience


Nice to Have

  • Experience with Go
  • Postgres operational experience
  • Cloudflare expertise
  • Fintech or other regulated-environment experience
