Job Description
Key Responsibilities
- Lead the design and implementation of Dynatrace, including the migration from AppDynamics and Splunk.
- Deploy, configure, and manage Dynatrace for full-stack observability across applications, infrastructure, and cloud environments.
- Build and maintain dashboards, alerts, DQL queries, SLOs, health rules, and anomaly detection models.
- Take a hands-on role in instrumenting applications, microservices, operating systems, and cloud platforms.
- Analyze metrics, logs, traces, and events to support incident detection, root cause analysis (RCA), and performance optimization.
- Partner with Development, Operations, and SRE teams to improve system reliability and service health.
- Automate observability processes and enforce platform best practices and governance.
- Support production issues and provide expert guidance during incidents and post-mortems.
Required Qualifications
- 8+ years of experience in IT operations, SRE, or observability engineering roles.
- Strong hands-on expertise in Dynatrace administration, configuration, automation, and platform design.
- Proven experience designing and implementing observability solutions using:
  - Dynatrace
  - Splunk
  - AppDynamics
- Strong experience with telemetry (metrics, logs, traces, events) and observability best practices.
- Solid operating system knowledge (Linux/Unix, Windows) with strong troubleshooting skills.
- Experience instrumenting:
  - Applications and microservices
  - Containers and Kubernetes (preferred)
  - Cloud platforms (AWS, Azure, or GCP preferred)
- Strong analytical skills for performance tuning and problem resolution.
- Excellent communication skills and the ability to collaborate across engineering teams.
Preferred Qualifications
- Experience with SRE practices (SLIs, SLOs, error budgets).
- Experience with Infrastructure as Code (Terraform, Ansible, etc.).
- Exposure to CI/CD and DevOps pipelines.
- Cloud-native and microservices architecture experience.
Benefits & Perks
- Opportunity to lead a large-scale enterprise observability transformation.
- Work with modern reliability and monitoring platforms.
- 100% remote contract with global exposure.
- High-impact role influencing platform stability and performance.