Senior Software Engineer for AI Model Evaluation

Alignerr, United States
Remote
AI Summary

Evaluate AI model performance, identify bugs, and provide expert-level feedback. Requires 3+ years of software engineering experience and strong proficiency in at least one of TypeScript, Ruby, Java, or C++.

Key Highlights
Evaluate AI model performance
Identify bugs and provide feedback
Strong proficiency in at least one of TypeScript, Ruby, Java, or C++
Key Responsibilities
Evaluate the performance of frontier AI language models
Identify bugs, logical errors, hallucinations, and reliability issues in AI-generated code
Write precise, expert-level feedback explaining model strengths, weaknesses, and failure modes
Technical Skills Required
TypeScript, Ruby, Java, C++
Benefits & Perks
Fully remote work
Flexible contract role
Potential for ongoing work and contract extension
Nice to Have
Experience across multiple programming languages or polyglot codebases
Familiarity with AI/LLM tooling or prior work in AI evaluation

Job Description


About The Role

What if your engineering instincts could directly influence how the world's most advanced AI models write code? We're looking for experienced software engineers to put frontier AI systems through rigorous evaluation — catching bugs, exposing logical failures, and providing the expert-level feedback that makes these models smarter and more reliable.

This is a fully remote, flexible contract role built for engineers who think critically, debug systematically, and don't take a model's output at face value.

  • Organization: Alignerr
  • Type: Hourly Contract
  • Location: Remote
  • Commitment: 10–40 hours/week

What You'll Do

  • Evaluate the performance of frontier AI language models on complex, real-world software engineering tasks
  • Identify bugs, logical errors, hallucinations, and reliability issues in AI-generated code
  • Design and review prompts, test cases, and evaluation scenarios that push models to their limits
  • Write precise, expert-level feedback explaining model strengths, weaknesses, and failure modes
  • Assess AI outputs across multiple programming languages and codebases for correctness and generalization
  • Think like a rigorous code reviewer — not just a user — and hold AI to a high engineering standard

Who You Are

  • 3+ years of professional software engineering experience
  • Strong proficiency in at least one of: TypeScript, Ruby, Java, or C++
  • A sharp debugger — you catch non-obvious issues and can explain exactly why something is wrong
  • Excellent written communication skills in English
  • Able to reason about complex systems and evaluate edge cases with precision
  • Familiar with modern development workflows — Git, CLI tooling, testing frameworks
  • Critical by nature: you evaluate model behavior rather than simply trust model outputs

Nice to Have

  • Experience across multiple programming languages or polyglot codebases
  • Familiarity with AI/LLM tooling or prior work in AI evaluation
  • Background in code review, QA engineering, or technical writing
  • Comfort working with ambiguous or novel problem types

Why Join Us

  • Work on cutting-edge AI projects alongside leading research labs
  • Fully remote and flexible — work when and where it suits you
  • Freelance autonomy with the structure of meaningful, high-impact technical work
  • Make a direct, tangible impact on how AI writes and reasons about code at scale
  • Potential for ongoing work and contract extension as new projects launch
