Senior DevOps Engineer

fal • United States

Relocation

AI Summary

Design and implement custom compute environments for customers, leveraging AI and automation to optimize performance and efficiency. Build and maintain Linux images, configure dedicated Kubernetes clusters, and set up network monitoring and diagnostics. Collaborate with cross-functional teams to drive technical decisions and improve processes.

Key Highlights

Custom compute environments for customers

AI and automation optimization

Linux image building and maintenance

Kubernetes cluster configuration

Network monitoring and diagnostics

Key Responsibilities

Build and deliver custom environments with excellent GPU performance for customer workloads

Use AI extensively to automate provisioning, alerting, and recovery

Provision and configure dedicated Kubernetes clusters tailored to customer requirements

Design and implement overlay networking (VLAN, VXLAN), routing configurations (ECMP, BGP), and tunnels (strongSwan, IPsec) for tenant isolation and performance

Build and maintain Linux images

Set up network monitoring and diagnostics for customer environments

Automate the end-to-end lifecycle of customer compute environments: creation, configuration, validation, and teardown

Technical Skills Required

Linux virtualization (KVM/QEMU, libvirt, VFIO device passthrough, hugepages, NUMA, CPU pinning); networking (VXLAN, VLAN, ECMP, BGP, ARP); Python, Bash, Ansible, Terraform; NVIDIA GPUs (drivers, MIG, container runtimes, InfiniBand, RDMA/RoCEv2, GPUDirect)

Benefits & Perks

$180,000-250,000 plus equity

health, dental, and vision insurance

relocation assistance

regular team events and offsites

Nice to Have

Experience with SR-IOV, DPDK, or other high-performance networking technologies

Experience with shared network storage (Ceph, Lustre, Weka)

Experience with network automation tools (Netbox, Nautobot, Nornir)

Job Description

fal is the generative media ecosystem powering the next generation of AI products. We build the infrastructure, tools, and model access that teams need to move from idea to production, and do it at scale without compromise. For developers and enterprises, fal is the foundation that makes generative media not just possible, but practical: a unified platform where high-performance inference, orchestration, and observability come together to unlock new categories of AI-native products.

As generative media reshapes industries across a market projected to grow by hundreds of billions over the next decade, fal is becoming the ecosystem that ambitious teams build on.

About This Role

You build the custom compute environments we deliver to customers — bare metal or virtual machines with GPU passthrough, dedicated Kubernetes clusters, and the networking that ties them together. You work across the full stack from Linux image building to overlay network design to cluster bootstrapping.

Requirements

5+ years of experience with Linux virtualization: KVM/QEMU, libvirt, VFIO device passthrough, hugepages, NUMA, CPU pinning
Strong networking fundamentals: VXLAN, VLAN, ECMP, BGP, ARP, and the ability to debug packet-level issues (tcpdump, Wireshark)
Production experience building and operating Kubernetes clusters on bare metal (MetalLB)
Proficiency with Linux image building and OS provisioning (kickstart, cloud-init, PXE/iPXE)
Proficiency in Python, Bash, Ansible and Terraform
Deep experience with NVIDIA GPUs: drivers, MIG, container runtimes (nvidia-container-toolkit), InfiniBand, RDMA/RoCEv2 and GPUDirect for high-performance AI networking
Excellent communication and ability to drive technical decisions across teams
Self-starter who executes quickly, takes ownership, and constantly seeks improvement

Compensation

$180,000-$250,000 plus equity and benefits (this range covers two levels: Senior and Staff)

Location

San Francisco, CA

What We Offer At Fal

Interesting and challenging work
A lot of learning and growth opportunities
We are currently hiring in downtown San Francisco.
We offer relocation assistance to San Francisco.
Health, dental, and vision insurance (US)
Regular team events and offsites

Job Overview

Posted Date: May 11, 2026

Employment Type: Full-time

Experience Level: Mid-Senior level

Location: United States

Annual Salary: 180,000 - 250,000 USD

Category: Programming

Company: fal