API Developer (Remote) - AI Agent Evaluation & LLM Training

quik hire staffing • Canada
Remote
AI Summary

This role involves designing and evaluating autonomous AI agents across multiple LLMs for real-world domains. Key responsibilities include writing evaluation rubrics, debugging agent traces, and assessing production-grade software architecture. Candidates must have backend engineering experience, proficiency in at least two major programming languages, and familiarity with SQL databases.

Key Highlights
Shape the future of agentic AI systems by providing expert human feedback to leading AI organizations.
Work fully remotely from the United States, Canada, United Kingdom, or Australia.
Hourly compensation ranges from USD $30 to $50, paid weekly via PayPal or AirTM.
Key Responsibilities
Write evaluation rubrics with objective pass/fail criteria.
Debug agent traces to identify failure patterns.
Stress test agents against edge cases, prompt injection, and tool misuse.
Assess production-grade modular software architecture.
Analyze multi-turn system interactions and behaviors.
Provide high-density technical feedback for LLM training.
Technical Skills Required
Python, JavaScript, Go, Java, SQL
Benefits & Perks
Fully remote work
Hourly compensation of USD $30-$50
Weekly payments via PayPal or AirTM
Nice to Have
Experience integrating agents with live tools such as Supabase, Gmail, and other APIs.
Familiarity with persistent state and session-tracking patterns.
Experience identifying privacy leaks, authority escalation, or indirect prompt injection vulnerabilities.

Job Description


  • Job Title: API Developer (Remote)
  • Location: Remote (United States, Canada, United Kingdom, Australia)
  • Work Mode: Fully Remote


Role Overview

Help design and evaluate autonomous AI agents across multiple LLMs in health, education, daily life, and other real-world domains. All tasks involve coding. You will shape the future of agentic AI systems by providing expert human feedback to leading AI organisations, and help train Large Language Models (LLMs) on complex, multi-step architectural workflows.


Key Responsibilities

AI Agent Evaluation

  • Write evaluation rubrics with objective pass/fail criteria
  • Debug agent traces to identify failure patterns
  • Stress test agents against edge cases, prompt injection, and tool misuse

Technical Assessment

  • Assess production-grade modular software architecture
  • Analyse multi-turn system interactions and behaviours
  • Provide high-density technical feedback for LLM training

Project Workflow

  • Create an account and upload a resume/ID
  • Complete the onboarding assessment
  • Start earning through flexible task assignments


Qualifications

  • Experience in backend engineering, AI automation, or complex systems integration
  • Proven ability to build and maintain production-grade software with modular separation (e.g., distinct services for data parsing, logic processing, and reporting)
  • Strong command of at least two major languages (e.g., Python, JavaScript, Go, or Java) and experience working with SQL databases
  • Practical experience building for live, non-mocked environments and handling multi-turn system interactions


Preferred (Nice to Have)

  • Experience integrating agents with live tools such as Supabase, Gmail, and other APIs
  • Familiarity with persistent state and session-tracking patterns
  • Experience identifying privacy leaks, authority escalation, or indirect prompt injection vulnerabilities


Compensation

  • Hourly compensation ranges from USD $30 to $50, depending on experience and task complexity
  • Payments are issued weekly via supported payout platforms (e.g., PayPal or AirTM)
  • Full compensation details are provided prior to task acceptance


Equal Opportunity Statement

Selection decisions are based solely on skills, qualifications, and project requirements. We are committed to inclusive and fair engagement practices and consider all qualified applicants without regard to legally protected characteristics.

