AI Performance Optimization Engineer

Jobgether • United States

Remote

AI Summary

Jobgether is seeking an AI Performance Optimization Engineer to improve the performance, efficiency, and scalability of AI training and inference systems. The ideal candidate has deep expertise in AI systems, performance engineering, and large-scale distributed computing environments. This role requires strong programming skills in Python and C++, hands-on experience optimizing deep learning workloads on modern GPU architectures, and a deep understanding of distributed training, inference systems, and model parallelism techniques.

Key Highlights

Improve AI system performance and efficiency

Optimize distributed training and inference

Develop advanced model optimization techniques

Key Responsibilities

Improve the performance, efficiency, and scalability of AI training and inference systems

Profile and optimize end-to-end AI pipelines

Identify bottlenecks across compute, memory, networking, and data pipelines

Technical Skills Required

Python, C++, GPU architectures, Distributed training, Inference systems, Model parallelism techniques, Quantization, Sparsity, Pruning, Compression, Triton, XLA, TorchInductor, TVM

Benefits & Perks

Competitive full-time compensation

Fully remote work model

Long-term, stable engineering engagement

Opportunity to work on cutting-edge AI infrastructure challenges

Job Description

This position is posted by Jobgether on behalf of a partner company. We are currently looking for an AI Performance Optimization Engineer in the United States.

This role focuses on pushing the limits of performance for large-scale AI systems, with an emphasis on maximizing throughput, minimizing latency, and reducing operational costs across training and inference workloads. You will work across the full stack of AI infrastructure, from GPU-level kernel optimization to distributed system tuning and model serving architecture. The environment is highly technical, data-driven, and collaborative, involving close partnership with ML engineers, platform teams, and product stakeholders. You will help translate complex, ambiguous performance challenges into measurable engineering improvements. The role is ideal for a hands-on expert who thrives in deep systems work and production-grade optimization. You will also contribute to shaping standards, benchmarks, and best practices across AI infrastructure teams.

Accountabilities

You will be responsible for improving the performance, efficiency, and scalability of AI training and inference systems across distributed environments.

Profile and optimize end-to-end AI pipelines to improve throughput, latency, and cost efficiency.
Identify bottlenecks across compute, memory, networking, and data pipelines, and implement targeted optimizations.
Develop and tune advanced model optimization techniques such as quantization, sparsity, pruning, and compression.
Optimize distributed training and inference using parallelism strategies (tensor, pipeline, FSDP, ZeRO).
Improve LLM serving performance through techniques such as KV caching, batching, and speculative decoding.
Drive kernel and compiler-level optimizations using tools like Triton, XLA, TorchInductor, or TVM.
Build benchmarking frameworks, performance monitoring systems, and regression testing suites.
Collaborate with cross-functional engineering teams to integrate performance best practices into production systems.
Evaluate hardware and software technologies and guide adoption decisions based on performance trade-offs.
Document optimization strategies and contribute to internal knowledge sharing and technical leadership.

Requirements

The ideal candidate has deep expertise in AI systems, performance engineering, and large-scale distributed computing environments.

Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field.
6+ years of experience in ML systems, performance engineering, or high-performance computing.
Strong programming skills in Python and C++, with production-level engineering experience.
Hands-on experience optimizing deep learning workloads on modern GPU architectures.
Deep understanding of distributed training, inference systems, and model parallelism techniques.
Experience with profiling tools across CPU, GPU, and distributed systems.
Strong knowledge of memory hierarchies, communication overheads, and system bottlenecks.
Familiarity with model compression and optimization techniques and their trade-offs.
Strong analytical skills with a disciplined, measurement-driven engineering approach.
Excellent communication skills and ability to collaborate across technical and non-technical teams.

Benefits

Competitive full-time compensation aligned with experience and expertise
Fully remote work model across the United States
Long-term, stable engineering engagement on high-impact AI systems
Opportunity to work on cutting-edge large-scale AI infrastructure challenges
Collaborative, engineering-driven environment with strong technical ownership
Exposure to advanced GPU systems, LLM optimization, and distributed AI frameworks
Career growth opportunities in high-performance AI systems engineering

How Jobgether Works

We use an AI-powered matching process to ensure your application is reviewed quickly, objectively, and fairly against the role's core requirements. Our system identifies the top-fitting candidates, and this shortlist is then shared directly with the hiring company. The final decision and next steps (interviews, assessments) are managed by their internal team.

We appreciate your interest and wish you the best!

Why Apply Through Jobgether?

Data Privacy Notice: By submitting your application, you acknowledge that Jobgether will process your personal data to evaluate your candidacy and share relevant information with the hiring employer. This processing is based on legitimate interest and pre-contractual measures under applicable data protection laws (including GDPR). You may exercise your rights (access, rectification, erasure, objection) at any time.

We may use artificial intelligence (AI) tools to support parts of the hiring process, such as reviewing applications, analyzing resumes, or assessing responses. These tools assist our recruitment team but do not replace human judgment. Final hiring decisions are ultimately made by humans. If you would like more information about how your data is processed, please contact us.

Job Overview

Posted Date: May 20, 2026

Employment Type: Full-time

Experience Level: Mid-Senior level

Location: United States

Category: Programming

Company: Jobgether
