Lead Site Reliability Engineer (SRE) - LATAM Remote

Blue Coding, United States
Remote
This job is no longer active and is no longer accepting applications.
AI Summary

Blue Coding is seeking an experienced Lead SRE to establish and define the SRE function for a US-based life insurance client. This is a foundational role to transform their NOC into a modern, automation-driven reliability team. You will be instrumental in shaping processes, selecting tools, and mentoring engineers to build a high-performing reliability function from the ground up.

Key Highlights
Opportunity to be the first SRE hire and define the reliability function.
Transition a traditional NOC to a modern, hybrid SRE approach.
Focus on building foundational SRE practices, automation, and observability.
Role requires strong collaboration, mentorship, and change agent capabilities.
Position is exclusively open to candidates based in LATAM countries.
Technical Skills Required
SRE, DevOps, Systems Engineering, AWS, Grafana, CloudWatch, Datadog, Terraform, Infrastructure-as-Code, Chef, Puppet, Ansible, Python, AWS Lambda, Ruby, Secrets Management (AWS Secrets Manager, HashiCorp Vault, Keeper, Infisical), Atlassian Suite (Jira, Confluence, Bitbucket), CI/CD, .NET
Benefits & Perks
Salary in USD
Full-time
100% Remote

Job Description


Why Blue Coding?

At Blue Coding, we specialize in hiring excellent developers and amazing people from all over Latin America and other parts of the world. For the past 11 years, we’ve helped cutting-edge companies in the United States and Canada build great development teams and develop great products. Large multinationals, digital agencies, SaaS providers, and software consulting firms are just a few of our clients. Our team of over 150 engineers, project managers, QA, UX/UI designers, and many more is distributed in more than 10 countries across the Americas. We are a fully remote company working with a wide array of technologies, and we have expertise in every stage of the software development process.

Our team is highly connected, united, and culturally diverse, and our collaborators are involved in many initiatives around the world, from wildlife preservation to volunteering at local charities. We stand for honesty, fairness, respect, efficiency, hard work, and cooperation.

This position is open exclusively to candidates based in LATAM countries.

What are we looking for?

In this opportunity, we are looking for an experienced Site Reliability Engineer to work with one of our foreign clients, a corporation that, through its subsidiaries, provides life insurance protection targeted to the middle American market. They're transforming how technology powers the life insurance experience. To support their ambitious growth goals, they’re evolving from a traditional Network Operations Center (NOC) model to a modern, hybrid Site Reliability Engineering (SRE) approach.

This is a rare opportunity to be the very first SRE hire—you won’t just support reliability, you’ll define it. From shaping processes and selecting tools to mentoring engineers and building automation, you’ll help them create a high-performing reliability function from the ground up.

What's unique about this job?

The company seeks to make life insurance more affordable for the middle market through innovation in product design and distribution. This includes call center and web-enabled sales and underwriting processes, quick issuance of policies, and an emphasis on products not medically underwritten at the time of sale.

Here are some of the exciting day-to-day challenges you will face in this role:


  • Build the foundation: Design and implement SRE best practices, processes, and tooling
  • Lead operational transformation: Help transition their NOC into a technically empowered, automation-driven reliability team
  • Own observability and monitoring: Drive improvements in system monitoring, alerting, and dashboards using tools like Grafana, CloudWatch, and Datadog
  • Automate everything: Reduce manual effort and increase resilience through Terraform, scripting, and cloud-native automation
  • Define and measure reliability: Establish SLIs, SLOs, and error budgets that keep the team accountable to high uptime and stability goals
  • Collaborate and mentor: Work closely with DevOps, SysOps, and engineering teams while helping upskill existing NOC engineers
  • Be a change agent: Bring a forward-looking mindset, driving cultural and technical change across the organization



You will shine if you have:


  • 5+ years in SRE, DevOps, or advanced systems engineering roles
  • Proven experience building or transforming SRE practices—you know what it takes to stand up a new function
  • Experience creating, managing, reporting, and analyzing stability metrics
  • Strong AWS expertise
  • Experience with secrets management tooling like AWS Secrets Manager, HashiCorp Vault, Keeper, or Infisical strongly desired
  • Experience in the Atlassian tool platform (Jira, Confluence, Bitbucket) strongly desired
  • Hands-on experience with Terraform and infrastructure-as-code, including tools like Chef, Puppet, or Ansible
  • Strong programming proficiency, with emphasis on scripting and serverless automation (e.g., AWS Lambda). Experience developing tools and integrations using Python or equivalent modern languages is essential
  • Capacity to interact with development teams to understand automation and monitoring needs
  • Proficiency in Python and/or Ruby for automation and integrations
  • Expertise in monitoring, observability, incident response, and service reliability
  • Ability to define observability, incident response, and SLIs/SLOs
  • Excellent collaborator with a passion for mentorship and team growth



It doesn’t hurt if you also have:


  • AWS certifications (highly preferred)
  • Experience with Ruby or .NET is a plus, supporting interoperability and legacy service integrations
  • Experience in insurance, fintech, or other regulated industries
  • Familiarity with incident.io, Jira Service Management, or similar ITSM tools
  • Background in CI/CD pipelines and modern DevOps practices



Here are some of the perks we offer you:


  • Salary in USD
  • Full-time
  • 100% Remote



Ready to learn more? Apply below!

We may use artificial intelligence (AI) tools to support parts of the hiring process, such as reviewing applications, analyzing resumes, or assessing responses. These tools assist our recruitment team but do not replace human judgment. Final hiring decisions are ultimately made by humans. If you would like more information about how your data is processed, please contact us.
