SRE/Observability Engineer (6-12 months)

Insight Global • United States

Remote

This job is no longer active. This position is no longer accepting applications.

AI Summary

Seeking a skilled SRE with strong observability expertise to drive reliability maturity across multi-team environments. Design, scale, and optimize Prometheus and Grafana environments. Apply SRE principles to improve application reliability.

Key Highlights

Design, scale, optimize, and manage Prometheus and Grafana environments.

Apply and evolve an SRE Maturity Model to help teams mature across observability, resilience, automation, and reliability.

Establish, implement, and maintain Service Level Objectives (SLOs) and error budgets across applications and services.

Technical Skills Required

Prometheus, Grafana, PromQL, Dynatrace, Kubernetes, cloud platforms (AWS/GCP/Azure), CI/CD pipelines

Benefits & Perks

6-12 month contract

Fully remote work

Job Description

Company: Brightspeed

Title: SRE / Observability Engineer

Term: 6 months – 12 months

Fully Remote

Overview

We are seeking a highly skilled Site Reliability Engineer (SRE) with strong observability expertise, proven communication skills, and the ability to drive reliability maturity across multi-team environments. This role is ideal for someone who can blend deep technical proficiency with strategic thinking and collaborative influence.

Key Responsibilities

Observability Engineering

• Design, scale, optimize, and manage Prometheus and Grafana environments.

• Write advanced PromQL queries, dashboards, visualizations, and metric-based calculations.

• Build out and maintain Grafana instances, supporting multi-team use cases.

• Leverage Dynatrace with strong proficiency in metrics and analytics to deliver efficient, actionable observability solutions for engineering and operations teams (e.g., dashboards, insights, reports).

• Analyze telemetry data to identify the metrics that matter (MTM), drive actionable insights, and influence engineering decisions.

Site Reliability Engineering

• Apply and evolve an SRE Maturity Model to help teams mature across observability, resilience, automation, and reliability.

• Establish, implement, and maintain Service Level Objectives (SLOs) and error budgets across applications and services.

• Partner effectively with engineering, product, operations, and leadership teams; translate complex technical insights into clear, actionable communication.

• Identify and reduce toil through automation, tooling improvements, and process refinement.

• Support incident analysis, reliability reviews, and continuous improvement initiatives.

Required Skills & Experience

• Familiarity with SRE principles, maturity models, and reliability roadmaps.

• Demonstrated experience improving application reliability via data-driven decisions.

• Hands-on experience with Prometheus, Grafana, and PromQL.

• Strong understanding of Dynatrace, metric analysis, and observability practices.

• Excellent communication skills and ability to collaborate across diverse technical and non-technical teams.

• Strong analytical and problem-solving skills with a bias for action.

Nice to Have

• Experience with Kubernetes, cloud platforms (AWS/GCP/Azure), or CI/CD pipelines.

• Experience with automation.

• Experience with large-scale distributed systems or high-availability architectures.

Job Overview

Posted Date: Dec 10, 2025

Employment Type: Contract

Experience Level: Associate

Location: United States

Category: DevOps

Company: Insight Global
