Senior DevOps Engineer

4ir solutions United States
Remote
AI Summary

Deliver mission-critical IT/OT infrastructure for industrial customers. Design complex architectures, deploy and manage Kubernetes-based infrastructure, and participate in incident response. Work with a small team to build, create, and shape how we operate.

Key Responsibilities
Design complex IT/OT architectures
Work directly with customers to understand their environment and estimate effort
Deploy and manage Kubernetes-based infrastructure and stateful applications across diverse customer environments
Participate in on-call rotation alongside the rest of the team
Own incidents through resolution, then drive root cause analysis that eliminates the class of problem—not just the symptom
Technical Skills Required
Kubernetes, GitOps, Infrastructure-as-Code, Terraform, Pulumi, Crossplane, Prometheus, Loki, Grafana
Benefits & Perks
Comprehensive benefits
Fully remote
401K
Nice to Have
Azure experience
SUSE ecosystem experience
Industrial, manufacturing, or OT environment experience
Familiarity with Inductive Automation's Ignition platform and MQTT

Job Description


About This Role

We deliver mission-critical IT/OT infrastructure—in cloud and on-prem—for industrial customers that can't afford downtime.

Small team. Hard problems. Practical solutions. No bureaucracy. No blame. No egos.

We ship it, own it, and make it better—blameless but accountable, shoulder to shoulder. We work hard. We stay human. We trust each other. We figure it out.

If you know what to do, delight in building it, and feel the ownership to support it—keep reading.

What You'll Do

Customer Delivery

  • Design complex IT/OT architectures—in cloud and on-prem—that are secure, recoverable, and sized appropriately
  • Work directly with customers to understand their environment and estimate effort
  • Own customer solutions end-to-end: requirements, design, build, and support
  • Build or use reusable modules when it makes sense—build bespoke when it doesn't
  • Deploy and manage Kubernetes-based infrastructure and stateful applications across diverse customer environments


Incident Response & Ownership

  • Participate in on-call rotation alongside the rest of the team—everyone here supports what we ship
  • Own incidents through resolution, then drive root cause analysis that eliminates the class of problem—not just the symptom
  • Build the runbooks, alerts, and automation that make the next incident less likely or less painful


Infrastructure & Automation

  • Work with Infrastructure-as-Code tools to provision and manage diverse customer environments
  • Implement and maintain GitOps workflows for in-cluster deployments
  • Ensure all infrastructure and application changes are declarative and version-controlled
  • Automate self-healing and system updates—reduce manual intervention and keep environments current


Observability & Reliability

  • Build and maintain monitoring, alerting, and dashboards using Prometheus, Loki, and Grafana
  • Define SLIs and SLOs that reflect what actually matters to customers
  • Surface real problems, reduce noise, and continually improve reliability and team efficiency


Shape the Future

  • We don't have everything figured out. You'll help build, create, and shape how we operate
  • Contribute to standards, patterns, and processes that make us better—not bureaucracy for its own sake
  • Bring the SRE mindset: automate toil, prefer boring/stable systems, and relentlessly improve


What We're Looking For

  • 5+ years in SRE, DevOps, or Infrastructure Engineering
  • Strong Kubernetes skills in production environments—you'll troubleshoot real clusters, not just tutorials
  • Experience with GitOps tooling (ArgoCD, Rancher Fleet, FluxCD, or similar)
  • Solid understanding of Infrastructure-as-Code concepts (Terraform, Pulumi, Crossplane, or similar)
  • Real incident response experience—you've been on-call, stayed calm, and fixed things under pressure
  • Comfort with heterogeneous environments—every customer site is a little different and you need to adapt
  • Clear communication skills—you can write a useful runbook, gather requirements on a customer call, and document what you learned
  • Ability to operate in ambiguity—we're building clarity, not waiting for it


Strong Plus

  • Azure experience (our primary cloud)
  • Experience with SUSE ecosystem (SLE Micro, RKE2, Rancher, Longhorn)
  • Industrial, manufacturing, or OT environment experience
  • Familiarity with Inductive Automation's Ignition platform and MQTT
  • Experience in a startup or small-team environment where you wore many hats


The SRE Mindset

This matters here. We need someone who:

  • Sees repetitive manual work as a problem to automate, not a fact of life
  • Prefers stable, predictable, "boring" production over clever and fragile
  • Supports what they create—no throwing things over the wall
  • Treats incidents as opportunities for systemic improvement
  • Works well on a small team where everyone carries weight
  • Stays current with SRE practices, emerging technologies, and cloud/edge trends


A Few Honest Words

This is a startup. Hours can be demanding. Priorities shift. You won't have a team of 30 backing you up.

What you will have: the autonomy to make real decisions, teammates who own their work, and customers who genuinely depend on what we build. We work hard because the work matters—and we have fun doing it.

If you want a structured 9-5, predictability, and a clear ladder—this probably isn't the right fit.

If you want to build, learn, and be part of something that's actually going somewhere—let's talk.

What We Offer

  • Comprehensive benefits (Medical, Dental, Vision, 401K)
  • Fully remote—work from anywhere in the world
  • A team where it's safe to be honest, learn from mistakes, and get better together
