Senior/Staff Engineer for Distributed ML Model Training
Job Description
Overview
Pluralis Research carries out foundational research on Protocol Learning: multi-participant training of foundation models where no single participant has, or can ever obtain, a full copy of the model. The purpose of Protocol Learning is to facilitate the creation of community-trained and community-owned frontier models with self-sustaining economics.
We're looking for Senior/Staff engineers with 5+ years of experience in distributed systems and large-scale ML training. You'll be implementing a novel substrate for training distributed ML models that works over consumer-grade internet connections.
Responsibilities
Distributed Training Architecture & Optimization
- Design and implement large-scale distributed training systems optimized for heterogeneous hardware operating under low-bandwidth, high-latency conditions.
- Develop and optimize model-parallel training strategies (data, tensor, pipeline parallelism) with custom sharding techniques that minimize communication overhead.
- Optimize GPU utilization, memory efficiency, and compute performance across distributed nodes.
- Implement robust checkpointing, state synchronization, and recovery mechanisms for long-running, fault-prone training jobs.
- Build monitoring and metrics systems to track training progress, model quality, and system bottlenecks.
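To give candidates a concrete flavor of the checkpointing and recovery work above, here is a minimal, illustrative sketch (not Pluralis code) of crash-safe checkpoint persistence for long-running, fault-prone jobs. It uses an atomic rename so that a crash mid-write can never corrupt the last good checkpoint; the function names and JSON state format are hypothetical, chosen for illustration only.

```python
import json
import os
import tempfile


def save_checkpoint(state: dict, path: str) -> None:
    """Write a checkpoint atomically: write to a temp file, fsync, then rename.

    A crash at any point leaves either the old checkpoint or the new one on
    disk, never a half-written file.
    """
    directory = os.path.dirname(path) or "."
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(state, f)
            f.flush()
            os.fsync(f.fileno())  # ensure bytes hit disk before the rename
        os.replace(tmp_path, path)  # atomic on POSIX filesystems
    except BaseException:
        os.unlink(tmp_path)  # clean up the partial temp file on failure
        raise


def load_checkpoint(path: str, default: dict) -> dict:
    """Resume from the last good checkpoint, or fall back to a fresh state."""
    try:
        with open(path) as f:
            return json.load(f)
    except (FileNotFoundError, json.JSONDecodeError):
        return dict(default)
```

Real training jobs would checkpoint optimizer and model shards rather than JSON, but the atomic-replace pattern is the same.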
Decentralized Networking & Fault Tolerance
- Architect resilient training systems where nodes can fail, networks can partition, and participants can dynamically join or leave.
- Design and optimize peer-to-peer topologies for decentralized coordination across non-co-located nodes.
- Implement NAT traversal, peer discovery, dynamic routing, and connection lifecycle management.
- Profile and optimize communication patterns to reduce latency and bandwidth overhead in multi-participant environments.
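One standard way to cut bandwidth in the multi-participant setting described above is gradient sparsification: send only the largest-magnitude gradient entries as (index, value) pairs. The sketch below is a hypothetical, simplified illustration of the idea (top-k sparsification over a plain Python list), not the method Pluralis uses.

```python
def topk_sparsify(grad: list[float], k: int) -> list[tuple[int, float]]:
    """Keep only the k largest-magnitude entries of a gradient vector.

    Returns (index, value) pairs sorted by index; the rest of the vector
    is implicitly zero, so only 2*k numbers cross the wire.
    """
    top_indices = sorted(range(len(grad)),
                         key=lambda i: abs(grad[i]),
                         reverse=True)[:k]
    return [(i, grad[i]) for i in sorted(top_indices)]


def densify(pairs: list[tuple[int, float]], n: int) -> list[float]:
    """Reconstruct a dense length-n vector from sparse (index, value) pairs."""
    out = [0.0] * n
    for i, v in pairs:
        out[i] = v
    return out
```

Production systems typically combine this with error feedback (accumulating the dropped residual locally), so the compression error does not bias training over time.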
Technical Skills Required
- Strong experience building and operating distributed systems in production.
- Hands-on expertise with distributed training frameworks (FSDP, DeepSpeed, Megatron, or similar).
- Deep understanding of model parallelism (data, tensor, pipeline parallelism).
- Expert-level Python with production experience (concurrency, error handling, retry logic, clean architecture).
- Strong networking fundamentals: P2P systems, gRPC, routing, NAT traversal, distributed coordination.
- Experience optimizing GPU workloads, memory management, and large-scale compute efficiency.
Benefits & Perks
- Equity-heavy compensation with meaningful ownership in a mission-driven company
- Competitive base salary for senior engineering roles in Australia
- Visa sponsorship available for exceptional candidates
- Remote-first with optional access to our Melbourne hub
- World-class team — teammates were previously at Google, Amazon, Microsoft, and leading startups