Research Program Manager

Reflection, United Kingdom
Visa Sponsorship
AI Summary

We are seeking a Research Program Manager to lead cross-functional programs and drive the development of our research infrastructure. The ideal candidate will have 7+ years of experience in technical program management and deep technical knowledge of distributed systems and ML/AI. The role will focus on scaling our research infrastructure to support massive training runs and ensuring end-to-end coordination across multiple teams.

Key Highlights
Lead cross-functional programs to drive research infrastructure development
Drive end-to-end coordination across multiple teams
Ensure infrastructure investments are tied to research velocity
Key Responsibilities
Own cross-functional programs spanning training infrastructure and cluster reliability
Drive end-to-end coordination of scaling our training stack alongside engineering leads and external partners
Jump into active incidents and escalations to triage, coordinate response, and drive resolution across teams
Technical Skills Required
Technical program management, distributed systems, ML/AI
Benefits & Perks
Top-tier compensation
Stock options
Comprehensive medical, dental, vision, and life insurance
Annual wellness allowance
Unlimited paid time off

Job Description


Our Mission

Reflection is a research lab making intelligence open and accessible for everyone to use, customize, and build on. We build open models that let anyone control their intelligence and help shape the future of AI. Our mission: make intelligence open and accessible to all.

About The Role

Research Program Managers at Reflection are high-leverage leaders and operators who embed directly with research and infrastructure teams to accelerate the pace of frontier model development. They are not project trackers. They are force multipliers who bring clarity to ambiguity, drive decisions when the path forward is unclear, and ensure that the work happening across multiple teams connects into a coherent whole.

This role focuses on scaling our research infrastructure to support massive, frontier-scale training runs across pre-training, mid-training, and post-training. You will work closely with teams building on training libraries like Megatron, driving the programs that turn raw clusters into reliable, high-performance training environments. Your job is to make sure the infrastructure we build works end-to-end, that teams are unblocked, and that we can scale with confidence as our ambitions grow.

You bring a first-responder mentality. When things go sideways, you don't wait to be asked. You jump in, assess the situation, cut through noise, align the people who need to be aligned, and drive resolution.

What You'll Do

  • Own cross-functional programs spanning training infrastructure and cluster reliability across pre-training, mid-training, and post-training workstreams.
  • Drive end-to-end coordination of scaling our training stack alongside engineering leads and external partners.
  • Jump into active incidents and escalations to triage, coordinate response, and drive resolution across teams. Champion a culture of blameless post-mortems and continuous learning, turning every incident into a concrete improvement to our systems and processes.
  • Partner with infrastructure and research engineering leads to identify bottlenecks, define priorities, and ensure that infrastructure investments are directly tied to research velocity.
  • Build and maintain visibility into training run health, cluster reliability, and infrastructure performance so that leadership and teams have the context they need to make fast, informed decisions.
  • Create lightweight, durable processes for cross-team handoffs, config management, checkpoint workflows, and other coordination-heavy touchpoints that currently rely on ad hoc communication.
  • Translate technical complexity into clear status updates and decision frameworks for engineering leadership and executives.

About You

  • 7+ years of experience in technical program management, research operations, or infrastructure coordination, ideally in ML/AI or large-scale distributed systems environments.
  • Deep enough technical knowledge to engage with engineers on topics like distributed training frameworks, GPU cluster architecture, scheduler behavior, networking, and storage systems. You don't need to write the code, but you need to understand the systems well enough to “speak the language”: to ask the right questions and identify risks early.
  • Proven ability to operate effectively in high-ambiguity, fast-moving environments. You create structure where there is none and drive clarity without waiting for permission.
  • Track record of managing complex, multi-team programs with competing priorities and hard deadlines. You know how to make tradeoffs and you communicate them clearly.
  • Strong stakeholder management skills across both deeply technical ICs and senior leadership. You build trust by being reliable, direct, and well-informed.
  • Comfortable operating in crisis mode. You stay calm under pressure, you know how to prioritize when everything is on fire, and you follow through on the other side.
  • Excited to build from zero to one. We are a small, fast-moving team, and this role will help define how research program management works at Reflection.
  • Motivated by enabling researchers and engineers to build the world's most capable open-weight AI systems.

What We Offer

We believe that to make intelligence open and accessible to all, you need to start at the foundation. Joining Reflection means building from the ground up as part of a talent-dense team. You will help define our future as a company, and help define the future of open foundational models.

We want you to do the most impactful work of your career with the confidence that you and the people you care about most are supported.

  • Top-tier compensation: Salary and equity structured to recognize and retain our talent globally.
  • Stock options: Everyone who joins and contributes to Reflection's success gets to share in the upside through stock options.
  • Health & wellness: Comprehensive medical, dental, vision, and life insurance, with an annual wellness allowance.
  • Meals: Lunch and dinner are provided in the office daily.
  • Life & family: 22 weeks of paid parental leave for all new birthing and non-birthing parents, including adoptive and surrogate journeys.
  • Vacation days: Unlimited paid time off in the U.S. and 30 days in the U.K.
  • Sponsorship support: We sponsor visas to help exceptional talent join our team and support long-term immigration pathways where applicable.
  • Team building: We have regular off-sites, happy hours, and team celebrations.

