Frontier AI Model Evaluation UI Developer

jetbridge ai • United States

Relocation

AI Summary

Build UI for exploring LLM evaluation results and experiment outputs. Design data visualizations and implement end-to-end traceability of LLM runs. Partner with researchers to iterate quickly while balancing clarity, accuracy, and performance.

Key Highlights

Build UI for exploring LLM evaluation results

Design data visualizations

Implement end-to-end traceability of LLM runs

Technical Skills Required

React, TypeScript, D3, Plotly, Vega/Vega-Lite, Visx, Three.js, Highcharts, ECharts

Benefits & Perks

Relocation sponsored

On-site in San Francisco

Job Description

Our client is a well-funded nonprofit research organization focused on measuring frontier AI capabilities, especially agentic and autonomous capabilities and the ability of models to conduct AI R&D. These capabilities can create outsized societal and security risk if they scale faster than our ability to evaluate and govern them.

Their work is unusually “real-world” compared with typical benchmarks. They build highly realistic evaluations, measure performance against skilled-human baselines (often on multi-hour tasks), and publish research on how quickly models are improving at completing long tasks.

You’d be building the UI that turns messy LLM evaluation outputs into clear, explorable artifacts that researchers can trust.

What you’ll do

- Build React + TypeScript interfaces for exploring LLM evaluation results and experiment outputs.

- Design and implement data visualizations that make model behavior, metrics, and results easy to inspect.

- Build workflows that support end-to-end traceability of LLM runs (prompts → intermediate steps → decisions → outputs).

- Partner closely with researchers; iterate quickly while balancing clarity, accuracy, and performance.

Tech stack / must-haves

- React + TypeScript

- Hands-on experience with at least one major visualization library: D3, Plotly, Vega/Vega-Lite, Visx, Three.js, Highcharts, or ECharts

Why this matters

- Their mission is to give society and AI labs grounded answers to: “What can frontier models actually do?” and “When do capabilities become dangerous?”

- The team includes researchers and engineers with backgrounds across top AI orgs and programs (e.g., OpenAI, DeepMind, and alumni of Oxford, Caltech, MIRI, and ML interpretability programs).

Location

- On-site in San Francisco (relocation sponsored).

Job Overview

Posted Date: Dec 31, 2025

Employment Type: Full-time

Experience Level: Entry level

Location: United States

Category: Programming

Company: jetbridge ai
