Senior ML Systems Engineer

Coders Connect, United Kingdom
Visa Sponsorship
AI Summary

Design, build, and optimize machine learning infrastructure for large-scale workflows. Collaborate with research and product teams to translate complex requirements into robust systems. Develop and optimize ML training infrastructure and data pipelines using modern distributed frameworks.

Key Highlights
Build and evolve a data platform for managing large-scale multimodal datasets
Develop and optimize ML training infrastructure and data pipelines
Create tools and systems for dataset inspection, model evaluation, and experimentation workflows
Key Responsibilities
Design, build, and optimize the infrastructure that powers large-scale machine learning workflows
Technical Skills Required
Python, C++, Java, Scala, PyTorch, Spark, Ray

Job Description


This role combines advanced machine learning infrastructure expertise with strong systems engineering capabilities to deliver scalable, production-ready AI solutions. The focus is on building high-performance platforms that enable the training, evaluation, and deployment of cutting-edge models across large multimodal datasets, directly impacting real-world applications in a rapidly evolving domain.


As the organization continues to scale its AI capabilities, it is hiring a Senior ML Systems Engineer to take ownership of its end-to-end ML infrastructure. This is a high-impact, hybrid role suited to someone who can operate across data engineering, model infrastructure, and production systems. You will work closely with research and product teams to translate complex requirements into robust, scalable systems that drive performance, reliability, and innovation.


Your Role

You will design, build, and optimize the infrastructure that powers large-scale machine learning workflows. From data ingestion and training pipelines to model deployment and inference optimization, you will operate across the full ML lifecycle. This role blends deep technical execution with cross-functional collaboration, ensuring systems are efficient, scalable, and aligned with business and product goals.


What You’ll Do

▪️ Build and evolve a data platform for managing large-scale multimodal datasets including video, embeddings, and metadata

▪️ Develop and optimize ML training infrastructure and data pipelines using modern distributed frameworks

▪️ Create tools and systems for dataset inspection, model evaluation, and experimentation workflows

▪️ Design and maintain infrastructure for model versioning, lifecycle management, and production promotion

▪️ Own and optimize production inference pipelines, including GPU utilization and parallelization strategies

▪️ Collaborate with research teams to productionize models and improve training efficiency

▪️ Ensure scalability, reliability, and performance across the ML platform

▪️ Implement best practices for monitoring, testing, and system observability

▪️ Contribute to architectural decisions and continuously improve platform design

▪️ Work in Agile environments and maintain clear technical documentation


What You Bring

▪️ Proven experience building production ML systems or ML infrastructure

▪️ Strong programming skills in Python and at least one robust production language such as C++, Java, or Scala

▪️ Experience with distributed computing frameworks and large-scale data processing

▪️ Hands-on experience with ML frameworks such as PyTorch and tools for scaling training workloads

▪️ Familiarity with modern data platforms and technologies such as Spark, Ray, or similar

▪️ Experience working with high-performance compute environments and GPU-based systems

▪️ Strong understanding of system design, scalability, and performance optimization

▪️ Product mindset with the ability to solve complex problems autonomously

▪️ Excellent communication skills and ability to collaborate with research and engineering teams


Why Join?

▪️ Own and shape the ML infrastructure behind cutting-edge AI systems

▪️ Work closely with highly technical teams on complex, high-impact challenges

▪️ Build scalable systems that directly influence product performance and innovation

▪️ Join a team that values autonomy, technical excellence, and continuous improvement


Package

▪️ Competitive base salary aligned to experience

▪️ Equity package with strong growth potential

▪️ Visa sponsorship available

▪️ Hybrid working model with flexibility


Location

▪️ London, United Kingdom

▪️ Hybrid role with in-office collaboration

