Evaluate AI model performance, identify bugs, and provide expert-level feedback. 3+ years of software engineering experience required. Strong proficiency in at least one programming language.
Job Description
About The Role
What if your engineering instincts could directly influence how the world's most advanced AI models write code? We're looking for experienced software engineers to put frontier AI systems through rigorous evaluation — catching bugs, exposing logical failures, and providing the expert-level feedback that makes these models smarter and more reliable.
This is a fully remote, flexible contract role built for engineers who think critically, debug systematically, and aren't content to just accept what a model outputs at face value.
Key Highlights
- Organization: Alignerr
- Type: Hourly Contract
- Location: Remote
- Commitment: 10–40 hours/week
Key Responsibilities
- Evaluate the performance of frontier AI language models on complex, real-world software engineering tasks
- Identify bugs, logical errors, hallucinations, and reliability issues in AI-generated code
- Design and review prompts, test cases, and evaluation scenarios that push models to their limits
- Write precise, expert-level feedback explaining model strengths, weaknesses, and failure modes
- Assess AI outputs across multiple programming languages and codebases for correctness and generalization
- Think like a rigorous code reviewer — not just a user — and hold AI to a high engineering standard
Technical Skills Required
- 3+ years of professional software engineering experience
- Strong proficiency in at least one of: TypeScript, Ruby, Java, or C++
- A sharp debugger — you catch non-obvious issues and can explain exactly why something is wrong
- Excellent written communication skills in English
- Able to reason about complex systems and evaluate edge cases with precision
- Familiar with modern development workflows — Git, CLI tooling, testing frameworks
- Critical by nature: you evaluate model behavior rather than simply trust model outputs
Nice to Have
- Experience across multiple programming languages or polyglot codebases
- Familiarity with AI/LLM tooling or prior work in AI evaluation
- Background in code review, QA engineering, or technical writing
- Comfort working with ambiguous or novel problem types
Benefits & Perks
- Work on cutting-edge AI projects alongside leading research labs
- Fully remote and flexible — work when and where it suits you
- Freelance autonomy with the structure of meaningful, high-impact technical work
- Make a direct, tangible impact on how AI writes and reasons about code at scale
- Potential for ongoing work and contract extension as new projects launch