Site Reliability Engineer

Jobgether • United States

Remote

AI Summary

Jobgether is seeking a Site Reliability Engineer to ensure the reliability, scalability, and performance of complex systems across cloud and on-premises environments. The ideal candidate will have experience in infrastructure engineering and operational best practices. This role involves hands-on management of large-scale data centers, automation of deployment workflows, and integration of observability tools.

Key Responsibilities

Design, implement, and maintain scalable infrastructure using containers, microservices, and Kubernetes

Monitor system performance and troubleshoot reliability issues

Manage CI/CD pipelines and GitOps workflows

Implement configuration management processes using tools like Ansible

Operate and optimize high-throughput Kafka clusters

Collaborate with development teams to influence system design and operational policies

Technical Skills Required

Linux systems expertise, Containers, Kubernetes, CI/CD pipeline management, GitOps workflows, Prometheus, Grafana, ELK Stack, Ansible, Kafka, ArgoCD, Helm charts, Kustomize

Benefits & Perks

Competitive salary: $118,000–$158,000 USD

Comprehensive medical, dental, and vision coverage

Employer-paid income protection benefits

Flexible spending accounts

Retirement plan with 401(k) and employer match

Employee Stock Purchase Plan (ESPP) and potential bonuses

Paid time off, sick leave, and company-observed holidays

Employee Assistance Program and additional perks

Job Description

This position is posted by Jobgether on behalf of a partner company. We are currently looking for a Site Reliability Engineer in the United States.

This role is responsible for ensuring the reliability, scalability, and performance of complex systems across cloud and on-premises environments. The Site Reliability Engineer will work closely with development, operations, and product teams to design and maintain resilient infrastructure, implement CI/CD pipelines, and manage containerized applications and Kubernetes clusters. You will proactively monitor system performance, troubleshoot critical issues, and optimize operational processes to maintain high service availability. This position involves hands-on management of large-scale data centers, automation of deployment workflows, and integration of observability tools. The ideal candidate is highly analytical, detail-oriented, and experienced in both infrastructure engineering and operational best practices. Success in this role directly impacts system uptime, operational efficiency, and overall customer satisfaction.

Accountabilities

Design, implement, and maintain scalable, highly available infrastructure using containers, microservices, and Kubernetes.
Monitor system performance, troubleshoot reliability issues, and ensure optimal operation of both cloud-based and on-premises systems.
Manage CI/CD pipelines and GitOps workflows, including ArgoCD, Helm charts, and Kustomize configurations for efficient software deployment.
Implement configuration management processes using tools like Ansible to ensure consistent environments across data centers.
Operate and optimize high-throughput Kafka clusters for event streaming, including replication, partitioning, and disaster recovery strategies.
Collaborate with development teams to influence system design, operational policies, and best practices.
Maintain comprehensive technical documentation, runbooks, architectural diagrams, and incident response procedures.
Participate in on-call rotations and conduct blameless post-mortems for critical incidents.
Continuously evaluate emerging technologies to enhance operational efficiency and reliability.

Requirements

Bachelor’s degree in Computer Science, Engineering, or a related field; advanced degree preferred.
5+ years of experience in site reliability engineering or a related field focused on production systems and service delivery.
Strong Linux systems expertise, including configuration, tuning, and troubleshooting.
Hands-on experience with containers, Kubernetes, and microservices architecture.
Proficiency in CI/CD pipeline management and GitOps workflows, including ArgoCD, Helm charts, and automation tools.
Experience with observability tools such as Prometheus, Grafana, and ELK Stack.
Proven ability to manage large on-premises data centers with hundreds of bare metal servers and VMs.
Familiarity with networking concepts, protocols, and configuration management tools.
Strong analytical and troubleshooting skills with the ability to resolve complex system issues.
Excellent communication skills and experience collaborating across cross-functional teams.

Benefits

Competitive salary: $118,000–$158,000 USD, depending on experience and location.
Comprehensive medical, dental, and vision coverage for employees and dependents.

Employer-paid income protection benefits including life, AD&D, short- and long-term disability.
Flexible spending accounts for healthcare and dependent care.
Retirement plan with 401(k) and employer match, plus Roth options.
Employee Stock Purchase Plan (ESPP) and potential bonuses.
Paid time off, sick leave, and company-observed holidays.
Employee Assistance Program and additional perks such as commuter benefits, discount programs, and identity theft protection.
Fully remote work opportunity within the U.S.

Why Apply Through Jobgether?

We use an AI-powered matching process to ensure your application is reviewed quickly, objectively, and fairly against the role's core requirements. Our system identifies the top-fitting candidates, and this shortlist is then shared directly with the hiring company. The final decision and next steps (interviews, assessments) are managed by their internal team.

We appreciate your interest and wish you the best!

Data Privacy Notice: By submitting your application, you acknowledge that Jobgether will process your personal data to evaluate your candidacy and share relevant information with the hiring employer. This processing is based on legitimate interest and pre-contractual measures under applicable data protection laws (including GDPR). You may exercise your rights (access, rectification, erasure, objection) at any time.

We may use artificial intelligence (AI) tools to support parts of the hiring process, such as reviewing applications, analyzing resumes, or assessing responses. These tools assist our recruitment team but do not replace human judgment. Final hiring decisions are ultimately made by humans. If you would like more information about how your data is processed, please contact us.

Job Overview

Posted Date: Apr 16, 2026

Employment Type: Full-time

Experience Level: Mid-Senior level

Location: United States

Annual Salary: 118,000–158,000 USD

Category: DevOps

Company: Jobgether
