Cloud Infrastructure & Software Reliability Engineer

Software Integrators, United Kingdom
Remote
AI Summary

We're hiring an Infrastructure & Site Reliability Engineer to work across software development and engineering teams, managing cloud infrastructure, supporting application teams, and resolving complex incidents.

Key Responsibilities
Manage cloud infrastructure across Azure and/or AWS, including provisioning, configuration, and cost management
Troubleshoot networking issues across VPN, DNS, firewalls, and routing configurations
Collaborate with distributed teams across the UK and Asia Pacific to improve platform observability and uptime
Support CI/CD pipelines across SI and Blink, ensuring reliable, fast deployments
Work closely with engineering teams to diagnose and resolve application-layer issues in production
Improve platform observability through logging, metrics, and alerting
Technical Skills Required
Cloud platforms (Azure and/or AWS, including networking, compute, and storage)
Linux servers in a production environment
PostgreSQL or other relational databases
Networking fundamentals: routing, firewalls, DNS, and VPNs
Scripting or automation in Bash, Python, or similar
Infrastructure-as-code and automation tooling
Benefits & Perks
Fully remote, UK-based role with occasional in-person collaboration
Compensated on-call allowance with a fair, sustainable rotation
Meaningful scope across two products - SI and Blink - with a varied, hands-on remit
Competitive salary, DoE
Nice to Have
Experience with DevOps tooling: CI/CD pipelines, Docker, Kubernetes
Cybersecurity fundamentals - hardening, access control, vulnerability remediation
Exposure to monitoring and logging platforms such as Prometheus, ELK, or Grafana

Job Description


About the Role


We’re hiring an Infrastructure & Site Reliability Engineer to work across both businesses in our group, Software Integrators (SI) and Blink. This is a hands-on technical role - not a service desk position - focused on keeping our applications, deployment pipelines, and cloud infrastructure reliable, secure, and scalable.


This is a pivotal position: you will have scope to make a big impact on the team and our customers, and the opportunity to help build the group's UK operations from near zero.


You'll operate at the intersection of infrastructure, DevOps, and platform engineering: owning cloud environments, supporting application teams, improving observability, and resolving complex incidents end-to-end. The ideal candidate has a strong sysadmin or DevOps foundation and is comfortable working directly with client engineering teams.


The role includes participation in a compensated on-call rotation (weekend pager duty included), shared fairly across the team.


This role sits across both businesses, owning the infrastructure and platforms that keep them running.


Software Integrators

Software Integrators is a dedicated software development and engineering team, building, maintaining, and supporting the full technology stack for one of the UK's largest same-day delivery companies.


Blink

Blink is a transport management system built for the vehicle logistics industry, connecting car carriers, shippers, dealers, and drivers. Originally launched in Australia, Blink is now expanding into the UK and New Zealand.


Key Responsibilities


Infrastructure & Cloud


  • Own and manage cloud infrastructure across Azure and/or AWS, including provisioning, configuration, and cost management.
  • Monitor, maintain, and improve Linux-based server environments and PostgreSQL databases.
  • Troubleshoot networking issues across VPN, DNS, firewalls, and routing configurations.
  • Contribute to infrastructure-as-code and automation tooling to reduce manual toil.


Application & Pipeline Support


  • Support CI/CD pipelines across SI and Blink, ensuring reliable, fast deployments.
  • Work closely with engineering teams to diagnose and resolve application-layer issues in production.
  • Improve platform observability through logging, metrics, and alerting (e.g. Prometheus, Grafana, ELK, Humio).
  • Participate in incident response: triage, root cause analysis, resolution, and post-incident documentation.


Reliability & Operations


  • Participate in a rotating on-call schedule with compensated weekend pager duty, shared equitably.
  • Maintain and expand runbooks, documentation, and internal tooling.
  • Collaborate with distributed teams across the UK and Asia Pacific.
  • Identify reliability risks proactively and lead improvements to uptime and performance.


Required Skills & Experience


  • Solid background in IT systems administration, infrastructure support, or technical operations (not end-user desktop support).
  • Strong hands-on experience with Linux servers in a production environment.
  • Experience with cloud platforms (Azure and/or AWS - including networking, compute, and storage).
  • Good understanding of networking fundamentals: routing, firewalls, DNS, and VPNs.
  • Experience with PostgreSQL or other relational databases.
  • Comfortable working independently and across distributed time zones.
  • Strong written and verbal communication skills - you'll engage directly with client engineering teams.


Bonus Skills (Nice to Have)


  • Experience with DevOps tooling: CI/CD pipelines, Docker, Kubernetes.
  • Cybersecurity fundamentals - hardening, access control, vulnerability remediation.
  • Exposure to monitoring and logging platforms such as Prometheus, ELK, or Grafana.
  • Familiarity with scripting or automation in Bash, Python, or similar.


What We Offer


  • A fully remote, UK-based role with occasional in-person collaboration.
  • An opportunity to help build the group's UK operations from near zero.
  • Meaningful scope across two products - SI and Blink - with a varied, hands-on remit.
  • Compensated on-call allowance with a fair, sustainable rotation.
  • A collaborative team culture that values autonomy, initiative, and technical excellence.
  • Competitive salary, DoE.


