Senior Site Reliability Engineer (Security Infrastructure) - Remote (India)

Jobgether India
Remote
This job is no longer active. This position is no longer accepting applications.
AI Summary

Seeking an experienced Senior Site Reliability Engineer based in India to ensure the reliability, scalability, and performance of critical security infrastructure. This fully remote role involves leading operational excellence initiatives, defining SLIs/SLOs, driving automation, and leading incident response for platforms such as intrusion detection and DDoS mitigation. You will influence architecture, observability, and operational practices at scale, with high ownership and mentorship opportunities.

Key Highlights
Lead initiatives for operational excellence across critical security platforms (intrusion detection, DDoS mitigation).
Define and own reliability outcomes, including SLIs/SLOs, error budgets, alerting, and dashboards.
Architect and implement high availability, capacity planning, and disaster recovery strategies.
Automate deployments, configuration, and compliance using IaC and scripting.
Lead incident response in a 24/7 on-call rotation and drive blameless postmortems.
Mentor and guide junior engineers and contractors.
Work in a fully remote environment with occasional team interactions.
Technical Skills Required
SaltStack, Python, Infrastructure as Code, Linux, TCP/IP, Routing, Load Balancing (L4-L7), Network Security, Icinga, Grafana, InfluxDB, rsyslog, Git, CI/CD, IDS/IPS, DDoS Mitigation, HAProxy, Nginx, Juniper, Palo Alto
Benefits & Perks
Competitive salary with performance incentives
Comprehensive health and family-friendly benefits
Parental leave
Flexible remote working arrangements
Retirement savings and equity opportunities
Paid time off and bonus eligibility
Professional development support and mentorship
Collaborative and inclusive team culture

Job Description


This position is posted by Jobgether on behalf of a partner company. We are currently looking for a Senior Site Reliability Engineer in India.

We are seeking an experienced Senior Site Reliability Engineer to ensure the reliability, scalability, and performance of critical security infrastructure. In this role, you will lead initiatives for operational excellence across mission-critical platforms such as intrusion detection, DDoS mitigation, and other security services. You will define service-level objectives, drive automation, and lead incident response, all while collaborating with cross-functional teams to deliver secure and high-performing systems. This is an opportunity to influence architecture, observability, and operational practices at scale, working in a fully remote environment with occasional team interactions. The role offers high ownership and the chance to mentor and guide other engineers in a dynamic, fast-paced setting.

Accountabilities:

  • Own reliability outcomes for security platforms, defining SLIs/SLOs, error budgets, alerting, dashboards, and runbooks
  • Architect and implement high availability, capacity planning, and disaster recovery for IDS/IPS, DDoS mitigation, and supporting services
  • Design zero/minimal-downtime maintenance and upgrade strategies for OS, firmware, and signature updates
  • Automate deployments, configuration, and compliance using SaltStack, Python, and Infrastructure as Code practices
  • Operate and optimize a heterogeneous stack including IPS, DDoS, HAProxy, Nginx, Juniper, and Palo Alto systems
  • Lead incident response in a 24/7 on-call rotation, act as incident commander, and drive blameless postmortems with durable fixes
  • Reduce operational toil through self-service tooling, automated health checks, and reliability reviews including game days and chaos testing
  • Maintain audit-ready operations aligned with compliance standards; create and update SOPs, operational documentation, and architectural diagrams
  • Mentor and provide technical guidance to contractors and junior engineers while collaborating with cross-functional teams


Requirements

  • 5+ years of experience in site reliability, production operations, or platform engineering supporting large-scale, mission-critical systems
  • Expert-level knowledge of SaltStack (or similar configuration management tools) for automation and deployments
  • Strong Linux administration skills with deep understanding of TCP/IP, routing, load balancing (L4-L7), and network security concepts
  • Proficiency in Python for automation, integrations, and operational tooling
  • Experience with observability tools: Icinga, Grafana, InfluxDB, and rsyslog pipelines
  • Familiarity with Git-based workflows, CI/CD pipelines, and Infrastructure as Code concepts
  • Proven effectiveness in 24/7 operations environments with on-call responsibilities and incident management experience
  • Excellent technical writing, documentation, and mentoring skills
  • Preferred: hands-on experience with IDS/IPS and DDoS platforms (TrendMicro TippingPoint, Suricata, NetScout/Arbor), HAProxy/Nginx administration, and network devices (Juniper, Palo Alto)
  • Bachelor's degree in Computer Science, Information Technology, or related field; industry certifications such as Security+, CISSP, Linux+ are a plus


Benefits

  • Competitive salary with performance incentives
  • Comprehensive health and family-friendly benefits, including parental leave
  • Flexible remote working arrangements
  • Retirement savings and equity opportunities (varies by role)
  • Paid time off and bonus eligibility
  • Professional development support and mentorship
  • Collaborative and inclusive team culture embracing diversity and entrepreneurship

Jobgether is a Talent Matching Platform that partners with companies worldwide to efficiently connect top talent with the right opportunities through AI-driven job matching.

When you apply, your profile goes through our AI-powered screening process designed to identify top talent efficiently and fairly.

🔍 Our AI evaluates your CV and LinkedIn profile thoroughly, analyzing your skills, experience, and achievements.

📊 It compares your profile to the job's core requirements and past success factors to determine your match score.

🎯 Based on this analysis, we automatically shortlist the 3 candidates with the highest match to the role.

🧠 When necessary, our human team may perform an additional manual review to ensure no strong profile is missed.

The process is transparent, skills-based, and free of bias — focusing solely on your fit for the role. Once the shortlist is completed, we share it directly with the company that owns the job opening. The final decision and next steps (such as interviews or additional assessments) are then made by their internal hiring team.

Thank you for your interest!

