Observability Engineer

Job Description


Dice is the leading career destination for tech experts at every stage of their careers. Our client, DRC Systems USA LLC, is seeking the following. Apply via Dice today!

Job Title: Observability Engineer

Location: Jersey City, NJ - Onsite (Relocation allowed)

Duration: 6+ Months (High Possibility of Extension)

Job Description:

We are looking for an experienced Observability Engineer with a strong grounding in SRE principles to design, build, and enhance end-to-end monitoring, logging, and telemetry solutions. The ideal candidate will develop observability dashboards, improve system reliability, and ensure proactive detection of performance issues across cloud and data platforms.

Key Responsibilities:

  • Implement and maintain observability frameworks, dashboards, and alerting systems.
  • Build and optimize telemetry pipelines for real-time monitoring and insights.
  • Collaborate with SRE, DevOps, and Platform teams to improve reliability and performance.
  • Support observability for Databricks and/or Snowflake environments.
  • Enable observability best practices across applications and cloud infrastructure.
  • Develop and manage infrastructure using AWS and Terraform.
  • Configure, manage, and optimize Grafana and Dynatrace for monitoring and analytics.
  • Troubleshoot incidents, conduct root-cause analysis, and drive continuous improvements.

Required Skills:

  • Strong understanding of SRE principles and reliability practices.
  • Experience building dashboards and observability workflows.
  • Fundamental knowledge of Databricks or Snowflake.
  • Hands-on experience with AWS cloud engineering.
  • Proficiency in Terraform for infrastructure as code (IaC).
  • Expertise with Grafana and Dynatrace.
