We're seeking a Senior Applied AI/ML Engineer to own the foundation model layer of our client's stack, fine-tune open-weight large language models, and design model pipelines. The role involves applying classical machine learning and building data infrastructure. The ideal candidate has 4-6 years of experience in production machine learning systems and a strong foundation in classical machine learning and forecasting.
Job Description
Company Description
Crudcook is a specialist talent solutions partner focused on senior technical hiring for high-growth technology and fintech companies. We work with venture-backed startups, scaling businesses, and category-defining teams to identify, engage, and place senior engineering and AI/ML talent across India and globally. Our approach combines deep technical understanding with a curated, relationship-led search process — we partner with a small, focused set of clients to ensure depth, quality, and meaningful candidate experiences. Our team has supported hiring across some of the most innovative startups and global technology organizations, with a track record of placing senior engineers, machine learning practitioners, and technical leaders into roles where they can do their best work. We believe great hiring is about fit, not filtering — and we're committed to a candidate-first process that respects both the time and the trajectory of the people we represent.
Role Description
This is a remote role for an Applied AI/ML Engineer (Foundation Models & Data), hired on behalf of our client — a well-funded fintech building payments infrastructure at scale. The Machine Learning Engineer will own the foundation model layer of the company's stack end-to-end, including fine-tuning open-weight large language models on proprietary transaction, partner, and operational data; designing model pipelines that move from raw event data to production inference; and building the data infrastructure that supports the full machine learning lifecycle. The role also involves applying classical machine learning where it is the right tool — including liquidity and volume forecasting, anomaly detection across transaction flows, and partner behavior modeling. The Machine Learning Engineer will work closely with the founding team, senior engineers, and cross-functional stakeholders to ship reliable, production-grade systems in a regulated, latency-sensitive domain. This is a first-ML-hire role, which means the scope is unusually broad, the ownership is real, and the engineer will help build the machine learning function from the ground up. Collaboration, systems thinking, technical leadership, and continuous learning are core aspects of this role.
Qualifications
- 4–6 years of experience building production machine learning systems, with significant hands-on work on transformer-based models
- Demonstrable experience fine-tuning open-weight LLMs (Llama, Qwen, Mistral, Gemma) using techniques such as LoRA, QLoRA, full fine-tuning, DPO, ORPO, or continued pre-training
- Deep understanding of transformer architecture, including attention mechanisms, positional encodings, tokenization tradeoffs, and context length considerations
- Proven track record of shipping at least one fine-tuned LLM to production
- Strong foundation in classical machine learning and forecasting — gradient boosting, time-series methods (Prophet, statsforecast, SARIMA), and statistical reasoning
- Experience designing and optimizing machine learning models across both classical and deep learning paradigms
- Proficiency in Python, with fluency in PyTorch and the Hugging Face ecosystem (transformers, peft, trl, datasets)
- Hands-on experience with at least one inference server such as vLLM, TGI, or SGLang
- Real data engineering capability — SQL fluency, pipeline orchestration, schema design, and familiarity with feature store concepts including point-in-time correctness and online/offline parity
- Comfort with Google Cloud Platform (Vertex AI, GKE, BigQuery, GCS) or equivalent experience on AWS or Azure
- Strong foundation in computer science, algorithms, statistics, and applied mathematics
- Strong analytical, problem-solving, and design-documentation skills
- Bachelor's or Master's degree in Computer Science, Machine Learning, Statistics, or a related field
- Experience with agent frameworks, Model Context Protocol (MCP), tool-use evaluation, or multi-agent orchestration is a plus
- Background in fintech, payments, fraud, or other regulated domains is a plus
- Open-source contributions to machine learning or LLM tooling are a plus
- Distributed training experience (FSDP, DeepSpeed, multi-node) is a plus
- Experience with liquidity, treasury, or financial forecasting in a payments or trading context is a plus
We look forward to your application!