We are seeking an experienced Senior SRE Tech Lead to guide our team in building and maintaining robust infrastructure across public and private cloud environments. This role will drive the adoption of SRE best practices, improve system performance, and ensure service continuity.
Job Description
Salary: Up to 70 lakhs INR
Overview
We are a rapidly growing technology organization focused on delivering innovative, high-quality platforms that enable seamless integration and scalability. Our mission is to build reliable, efficient, and secure systems that support millions of users and transactions daily.
Why This Role Matters
As our services expand, ensuring reliability, scalability, and operational excellence becomes critical. We are seeking an experienced Senior SRE Tech Lead to guide our team in building and maintaining robust infrastructure across public and private cloud environments. This role will drive the adoption of SRE best practices, improve system performance, and ensure service continuity.
Key Responsibilities
- SRE Strategy & Roadmap: Define and execute strategies to enhance reliability, performance, and scalability.
- Observability Leadership: Oversee monitoring, alerting, logging, and tracing systems to ensure optimal observability.
- Service Quality: Establish and maintain SLOs/SLAs, manage error budgets, and lead improvement initiatives.
- Performance Optimization: Identify and resolve bottlenecks in latency and throughput.
- Incident Management: Act as incident commander during outages, lead RCA efforts, and implement preventive measures.
- Automation & Efficiency: Drive automation of operational tasks to reduce toil and improve scalability.
- Team Leadership: Mentor and guide SRE team members, fostering technical growth and collaboration.
- Cross-functional Collaboration: Partner with development, infrastructure, and security teams to promote a DevOps culture.
Mandatory Qualifications
- 5+ years of experience in SRE or infrastructure engineering, with at least 2 years in a leadership role.
- Proven experience managing production systems in public or private cloud environments (AWS, GCP, Azure, etc.).
- Expertise in designing and operating Kubernetes clusters at scale.
- Strong knowledge of monitoring and logging tools (Prometheus, Grafana, ELK, Datadog).
- Deep understanding of UNIX-like systems and networking fundamentals (TCP/IP, HTTP).
- Hands-on experience with CI/CD pipelines (Jenkins, GitLab CI/CD, CircleCI).
- Proficiency in scripting languages (Shell, Python) for automation.
- Excellent communication and collaboration skills.
Preferred Qualifications
- Background in web application development.
- Experience with test automation, or in a Software Engineer in Test (SET) role.
- Practical experience with observability metrics and error budget management.
- Track record of reducing operational toil through automation.
- Experience working with globally distributed teams.
Location
Relocation required.