MLOps Engineer

Jobgether • United States

Remote

AI Summary

This role is focused on building and operating high-performance machine learning inference platforms. The MLOps Engineer will ensure that model serving infrastructure is reliable, scalable, and optimized for latency, throughput, and cost efficiency. The ideal candidate will have 6+ years of experience in distributed systems, infrastructure engineering, or ML platform development.

Key Highlights

Design, build, and operate scalable model serving platforms

Optimize inference performance using techniques such as batching and caching

Implement multi-tenant serving architectures with rate limiting and traffic management controls

Key Responsibilities

Design, build, and operate scalable model serving platforms for LLMs, vision models, and recommendation systems

Optimize inference performance using techniques such as batching, caching, speculative decoding, and request routing strategies

Implement multi-tenant serving architectures with rate limiting, QoS policies, and traffic management controls

Technical Skills Required

Python, Distributed systems, Machine learning

Benefits & Perks

Competitive salary range of $100,000 - $150,000 annually

100% remote position within the United States

Full-time W2 employment with long-term stability

Nice to Have

Experience with AI model serving at scale, multi-region systems, or FinOps optimization

Job Description

This position is listed on behalf of a partner company, which manages all applications and next steps. Our partner is looking for an MLOps Engineer based in the United States.

This role is focused on building and operating high-performance machine learning inference platforms that support large-scale, production AI systems.

You will be responsible for ensuring that model serving infrastructure is reliable, scalable, and optimized for latency, throughput, and cost efficiency.

The position sits at the intersection of machine learning and distributed systems engineering, with a strong emphasis on production-grade performance.

You will design systems that handle complex workloads such as LLMs, vision models, and recommendation engines across cloud-native environments.

The environment is highly technical, fast-moving, and deeply focused on engineering excellence and observability.

You will work closely with ML researchers, product teams, and infrastructure engineers to bring cutting-edge models into production.

This is a hands-on role where your work directly impacts AI product performance, scalability, and user experience at scale.

Accountabilities

Design, build, and operate scalable model serving platforms for LLMs, vision models, and recommendation systems.
Optimize inference performance using techniques such as batching, caching, speculative decoding, and request routing strategies.
Implement multi-tenant serving architectures with rate limiting, QoS policies, and traffic management controls.
Develop autoscaling and capacity planning systems to balance latency, cost, and throughput across workloads.
Improve GPU utilization and memory efficiency for high-performance inference workloads.
Integrate model serving systems with APIs, identity services, and observability platforms.
Build and enhance observability frameworks covering latency, GPU metrics, error tracking, and system health.
Support deployment pipelines including canary releases, shadow testing, and rollback mechanisms.
Participate in incident response for production AI services and drive long-term reliability improvements.
Collaborate with ML and product teams to support model releases and production rollouts.
Implement security and abuse prevention controls at the serving layer.
Document system behavior, operational procedures, and performance tuning best practices.

Requirements

Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field.
6+ years of experience in distributed systems, infrastructure engineering, or ML platform development.
Strong proficiency in Python and a systems programming language such as Go, Rust, or C++.
Experience building and operating high-throughput, low-latency production systems.
Hands-on experience with LLM inference frameworks such as vLLM, TensorRT-LLM, or similar.
Strong understanding of GPU architecture, memory management, and performance optimization.
Experience with Kubernetes, cloud platforms, and autoscaling infrastructure.
Strong knowledge of observability tools including metrics, logging, and distributed tracing systems.
Solid understanding of performance engineering, capacity planning, and distributed system design.
Strong communication and incident response skills in production environments.
Experience with AI model serving at scale, multi-region systems, or FinOps optimization is a plus.

Benefits

Competitive salary range of $100,000 - $150,000 annually.
100% remote position within the United States.
Full-time W2 employment with long-term stability.
Opportunity to work on cutting-edge AI inference and LLM serving systems.
Exposure to advanced GPU optimization and large-scale distributed AI infrastructure.
Career growth through ownership of production AI platforms and architecture decisions.
Inclusive and equal opportunity workplace culture.

How Jobgether Works

We use an AI-powered matching process to ensure your application is reviewed quickly, objectively, and fairly against the role's core requirements. Our system identifies the top-fitting candidates, and this shortlist is then shared directly with the hiring company. The final decision and next steps (interviews, assessments) are managed by their internal team.

We appreciate your interest and wish you the best!

Data Privacy Notice: By submitting your application, you acknowledge that Jobgether will process your personal data to evaluate your candidacy and share relevant information with the hiring employer. This processing is based on legitimate interest and pre-contractual measures under applicable data protection laws (including GDPR). You may exercise your rights (access, rectification, erasure, objection) at any time.

We may use artificial intelligence (AI) tools to support parts of the hiring process, such as reviewing applications, analyzing resumes, assessing responses, and identifying potential inconsistencies or verification signals in application materials based on available information. These tools assist our recruitment team but do not replace human judgment. Final hiring decisions are made by humans. If you would like more information about how your data is processed, please contact us.

Job Overview

Posted Date: Jul 03, 2026

Employment Type: Full-time

Experience Level: Not Applicable

Location: United States

Category: Programming

Company: Jobgether

