Job Description
Network Monitoring & Observability Engineer (m/f)
Start: ASAP
Duration: 6 months
Location: fully remote
We are seeking a highly skilled and experienced Monitoring and Observability Engineer to join our team. This role involves designing, implementing, and managing comprehensive monitoring solutions using Prometheus, Grafana, SNMP-Exporter, Streaming Telemetry, OpenTelemetry, and related technologies. The ideal candidate will have a strong background in time-series databases, network monitoring, and dashboard development, with a focus on ensuring the reliability and performance of our infrastructure and applications.
Responsibilities
- Design, implement, and manage Prometheus-based monitoring solutions, including configurations and alert rules.
- Develop and maintain interactive and visually appealing Grafana dashboards.
- Configure SNMP modules and scrape jobs to collect SNMP metrics efficiently across different network technologies.
- Work confidently with Git: clone working branches, develop on them, and commit into the main branch, or follow an equivalent workflow that demonstrates a strong command of Git.
- Identify and onboard new metrics from various systems and applications, developing data pipelines for metrics collection and storage.
- Optimize and scale monitoring environments to handle large volumes of metrics and ensure comprehensive monitoring coverage.
Requirements
- Familiarity with network monitoring tools and practices.
- Extensive experience with Prometheus and related technologies (Alertmanager, Pushgateway, etc.).
- Strong knowledge of time-series databases and monitoring concepts.
- Proficiency in writing Prometheus queries (PromQL).
- Strong experience with Grafana and its ecosystem.
- Proficiency in creating and managing Grafana dashboards and panels.
- Knowledge of data visualization principles and best practices.
- Familiarity with monitoring and observability tools and practices.
- Strong knowledge of SNMP protocols and network device management.
- Experience with SNMP-Exporter and its integration with Prometheus.
- Strong skills in creating SNMP modules and scrape configurations for various network technologies.
- Strong Git experience.
- Strong understanding of metrics and monitoring concepts.
- Experience with metrics collection tools (Prometheus, Telegraf, collectd, etc.).
- Experience with Streaming Telemetry solutions for real-time monitoring.
- Experience with OpenTelemetry for tracing and observability.
- Familiarity with Linux/Unix systems and scripting languages (Bash, Python).
- Experience with containerization and orchestration tools (Docker, Kubernetes).