Job Description
Role: AI/LLM Test Engineer
12+ month full-time contract
100% remote
Position Overview
We are seeking a detail-oriented and highly skilled AI/LLM Test Engineer to join our team. The ideal candidate will play a critical role in ensuring the quality, performance, and reliability of AI/LLM-based applications by designing, developing, and executing testing strategies tailored to large language models and AI systems. This role demands a strong understanding of AI/ML concepts, software testing methodologies, and automation frameworks.
Key Responsibilities
- Test Strategy and Planning
  - Design and implement comprehensive test plans for AI and LLM systems, focusing on performance, accuracy, and reliability.
  - Collaborate with product managers, developers, and data scientists to understand requirements and define testing objectives.
- Testing Execution
  - Conduct functional, regression, performance, and scalability testing on AI/LLM models and applications.
  - Validate model outputs against benchmarks, metrics, and real-world scenarios to ensure alignment with business needs.
- Automation and Tools
  - Develop and maintain automated testing frameworks and scripts for LLM-based applications.
  - Leverage tools for model evaluation, including adversarial testing, bias detection, and output validation.
- Monitoring and Reporting
  - Analyze testing results, identify issues, and provide actionable insights to improve model performance.
  - Report and document bugs, inconsistencies, and edge cases to the development and data science teams.
- Collaboration and Improvement
  - Work closely with cross-functional teams to improve the AI/LLM systems' reliability and user experience.
  - Stay updated with the latest advancements in AI/ML testing methodologies and tools.
Required Skills and Qualifications
- Education: Bachelor’s or Master’s degree in Computer Science, AI/ML, or a related field.
- Experience:
  - 2+ years of experience in software testing or quality assurance, preferably with AI/ML applications.
  - Hands-on experience testing AI/LLM models, APIs, and associated platforms.
- Technical Skills:
  - Strong programming skills in Python, Java, or similar languages.
  - Familiarity with AI/ML frameworks such as TensorFlow, PyTorch, or Hugging Face Transformers.
  - Proficiency in test automation tools such as Selenium, Appium, or pytest.
  - Knowledge of model evaluation metrics, including accuracy, precision, recall, and F1 score.
- Analytical Skills: Ability to design edge cases and adversarial scenarios that test model robustness.
- Soft Skills: Strong communication, problem-solving, and collaboration skills.
Preferred Skills
- Knowledge of NLP, embeddings, and tokenization processes.
- Familiarity with version control systems (e.g., Git).
- Experience working with cloud platforms like AWS, Azure, or GCP for AI/ML workloads.
- Exposure to bias and fairness testing in AI/ML systems.