Engineering AI Evaluator

Mercor • United Kingdom

Remote

This job is no longer active and is not accepting applications.

AI Summary

Mercor, which connects elite creative and technical talent with leading AI research labs, is seeking an Engineering AI Evaluator to assess LLM-generated responses to engineering queries. The ideal candidate will have a PhD in Engineering or a closely related field, deep expertise in at least one engineering discipline (such as Mechanical & Physical Systems Engineering), and significant experience using large language models.

Key Highlights

Evaluate LLM-generated responses for technical accuracy and applied reasoning

Annotate model responses by identifying strengths, areas of improvement, and inaccuracies

Assess clarity, structure, and appropriateness of explanations for different audiences

Technical Skills Required

Python

Large Language Models (LLMs)

Mechanical & Physical Systems Engineering

Electrical, Electronic & Computer Engineering

Chemical, Materials & Process Engineering

Civil, Environmental & Infrastructure Engineering

Benefits & Perks

$73/hour

Remote work

Job Description

About The Job

Mercor connects elite creative and technical talent with leading AI research labs. Headquartered in San Francisco, our investors include Benchmark, General Catalyst, Peter Thiel, Adam D'Angelo, Larry Summers, and Jack Dorsey.

Position: Engineering AI Evaluator

Type: Contract

Compensation: $73/hour

Location: Remote

Role Responsibilities

Write and refine prompts to guide model behavior in engineering scenarios.
Evaluate LLM-generated responses to engineering-related queries for technical accuracy and applied reasoning.
Conduct fact-checking and verify technical claims using authoritative sources and domain knowledge.
Annotate model responses by identifying strengths, areas of improvement, and inaccuracies.
Assess clarity, structure, and appropriateness of explanations for different audiences.
Apply consistent evaluation standards by following taxonomies, benchmarks, and guidelines.

Qualifications

Must-Have

PhD in Engineering or a closely related field.
Deep expertise in Mechanical & Physical Systems Engineering, Electrical, Electronic & Computer Engineering, Chemical, Materials & Process Engineering, or Civil, Environmental & Infrastructure Engineering.
Significant experience using large language models (LLMs).
Excellent writing skills to explain complex engineering concepts.
Strong attention to detail.
Experience reviewing or editing technical or academic writing.

Preferred

Experience with applied research, industry engineering workflows, or systems design.
Prior experience with RLHF, model evaluation, or data annotation work.
Experience teaching, mentoring, or explaining engineering concepts to non-expert audiences.
Familiarity with evaluation rubrics, benchmarks, or structured review frameworks.

Application Process (Takes 20–30 mins to complete)

Upload resume
AI interview based on your resume
Submit form

Resources & Support

For details about the interview process and platform information, please check: https://talent.docs.mercor.com/welcome/welcome
For any help or support, reach out to: support@mercor.com

PS: Our team reviews applications daily. Please complete your AI interview and application steps to be considered for this opportunity.


Job Overview

Posted Date Jan 09, 2026

Employment Type Part-time

Experience Level Not Applicable

Location United Kingdom

Hourly Rate 73 USD

Category Programming

Company Mercor
