Job Description
Think Different. Build the Future. 🚀
Our Mission
Build everyday AGI. Trustworthy, consumer-grade agents that redefine human–AI collaboration for millions. Software shouldn’t wait for commands; it should partner with you, amplifying what you can do every single day.
Why AGI, Inc.
We’re a stealth team of elite founders and AI researchers, with backgrounds spanning Stanford, OpenAI, and DeepMind. We’re industry leaders in mobile and computer-use agents, bringing these capabilities to consumer scale.
Grounded in years of agent research, our AI is designed with trustworthiness and reliability as core pillars, not afterthoughts.
We are supported by tier-1 investors who funded the first generation of AI giants; now they’re backing us to build the next: everyday AGI.
If you see possibility where others see limits, read on.
What You’ll Do
Training Automation: Design and implement robust CI/CD pipelines for machine learning workflows. Automate nightly and on-demand training runs, including data ingestion, job orchestration, checkpointing, and artifact management, with reliability as a first-class requirement.
Evaluation Infrastructure: Build scalable evaluation harnesses that automatically benchmark models on every merge. Optimize latency and resource usage so experimentation stays fast, and performance regressions are caught immediately.
Research Tooling: Develop internal SDKs, CLIs, and lightweight UIs (e.g., Streamlit, Retool) that empower researchers to:
- Inspect trajectories and traces
- Visualize model failures
- Curate and manage datasets
- Iterate without friction
Observability & Performance: Implement comprehensive tracking for:
- Model latency, throughput, and error rates
- GPU utilization and cluster health
- Inference cost and unit economics
Minimum Qualifications
- Bachelor’s degree in Computer Science, Engineering, or equivalent practical experience
- 3+ years in Software Engineering, MLOps, or ML Infrastructure
- Strong Python proficiency
- Experience building internal developer tools, CLIs, or dashboards
- Experience with cloud infrastructure (AWS or GCP) and containerization (Docker, Kubernetes)
Nice to Have
- Experience designing CI/CD pipelines specifically for ML workflows
- Familiarity with LLM serving stacks such as vLLM or TGI
- Experience managing GPU clusters and optimizing distributed workloads
Great research without great infrastructure slows to a crawl.
Great infrastructure multiplies the impact of every researcher.
You will define how experiments scale, how reliability is measured, and how quickly we can ship improvements to real users. The systems you build will directly shape the speed and quality of our progress toward everyday AGI.
🏢 All in, in person — work moves faster face-to-face
🚀 Ship by default — novel and polished can coexist, speed is the feature
🤝 One band, one sound — radical candor, zero politics, help each other win
Perks
🏥 Competitive company-sponsored medical, dental, and vision insurance
✈️ Top-tier relocation and immigration support
How To Apply
Send us:
- A link — or 60-second video — of something you built and why it matters
- Your resume or LinkedIn
- Two sentences on the hardest problem you've cracked
If you see possibility where others see limits, we'd love to meet you.