AI Summary
Design, fine-tune, and evaluate large language models for downstream NLP use cases. Implement robust data pipelines and build scalable model training and inference pipelines. Productionize models and optimize inference throughput and cost.
Key Highlights
Design, fine-tune, and evaluate large language models for downstream NLP use cases
Implement robust data pipelines for text preprocessing, annotation ingestion, and training dataset creation
Develop scalable model training and inference pipelines using Hugging Face / PyTorch / TensorFlow
Productionize models: containerize, deploy, and monitor LLM services with MLOps best practices
Optimize inference throughput and cost: quantization, model distillation, batching, GPU/CPU resource management, and vector-search integration
Benefits & Perks
Fully remote, India-based role with flexible hours
Strong focus on asynchronous collaboration
Opportunity to own end-to-end LLM product features and influence architecture decisions
Learning-first culture with support for conferences, courses, and hands-on experimentation with cutting-edge ML tooling
Job Description
About The Opportunity
Industry: Enterprise AI / Natural Language Processing (NLP). Sector: Large Language Model (LLM) products and NLP-driven automation for B2B applications including conversational AI, semantic search, and knowledge ingestion. Remote role based in India delivering production-grade ML systems and inference services.
Primary Job Title: Senior Machine Learning Engineer — NLP & LLM
Role & Responsibilities
- Design, fine-tune, and evaluate large language models for downstream NLP use cases (chatbots, semantic search, summarization, QA), and bring them to production readiness.
- Implement robust data pipelines for text preprocessing, annotation ingestion, and training dataset creation to support supervised and instruction-tuning workflows.
- Develop scalable model training and inference pipelines using Hugging Face / PyTorch / TensorFlow, including distributed training and mixed-precision optimization.
- Productionize models: containerize, deploy, and monitor LLM services with MLOps best practices (CI/CD, model versioning, canary rollouts, observability).
- Optimize inference throughput and cost via quantization, model distillation, request batching, GPU/CPU resource management, and vector-search integration (FAISS).
- Collaborate with product managers, ML researchers, and engineers to define evaluation metrics, A/B tests, and continuous-improvement loops for model quality and safety.
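To make the vector-search responsibility above concrete: a "flat" index is simply the stored embedding matrix searched exhaustively. The NumPy sketch below illustrates the idea behind what FAISS's flat L2 index does; the helper names (`build_index`, `search`) and the toy data are illustrative assumptions, not FAISS's API.

```python
import numpy as np

def build_index(embeddings: np.ndarray) -> np.ndarray:
    # A flat index is just the raw embedding matrix in float32;
    # FAISS stores the same thing and searches it with optimized kernels.
    return np.ascontiguousarray(embeddings.astype(np.float32))

def search(index: np.ndarray, queries: np.ndarray, k: int = 3):
    # Squared L2 distance between every query and every stored vector.
    dists = ((queries[:, None, :] - index[None, :, :]) ** 2).sum(-1)
    idx = np.argsort(dists, axis=1)[:, :k]  # ids of the k nearest vectors
    return np.take_along_axis(dists, idx, axis=1), idx

rng = np.random.default_rng(0)
docs = rng.normal(size=(100, 8))     # 100 toy document embeddings
index = build_index(docs)
d, i = search(index, docs[:2], k=1)  # each doc is its own nearest neighbor
```

In production this brute-force scan is what approximate indexes (IVF, HNSW) trade a little recall to avoid at scale.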
Must-Have
- 4+ years of hands-on ML engineering experience with strong emphasis on NLP and LLM workflows.
- Proven experience fine-tuning and deploying transformer-based models (Hugging Face Transformers).
- Strong Python engineering skills and deep familiarity with PyTorch or TensorFlow.
- Experience building production ML pipelines, containerization (Docker) and cloud deployment (AWS / GCP) with monitoring and CI/CD.
- Experience with LangChain or similar orchestration for model agents and tool-calling.
- Hands-on knowledge of vector search systems (FAISS) and inference optimization techniques (quantization, distillation).
Benefits & Perks
- Fully remote, India-based role with flexible hours and a strong focus on asynchronous collaboration.
- Opportunity to own end-to-end LLM product features and influence architecture decisions in a fast-paced AI team.
- Learning-first culture with support for conferences, courses, and hands-on experimentation with cutting-edge ML tooling.
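The quantization requirement listed under Must-Have can be sketched in a few lines. This is a minimal, hypothetical illustration of symmetric per-tensor int8 post-training quantization in plain NumPy, not any library's API; real deployments would use framework tooling (e.g. PyTorch or bitsandbytes) and often per-channel scales.

```python
import numpy as np

def quantize_int8(w: np.ndarray):
    # Symmetric per-tensor scheme: map [-max|w|, +max|w|] onto int8.
    scale = float(np.abs(w).max()) / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.normal(scale=0.02, size=(256, 256)).astype(np.float32)  # toy weight matrix
q, s = quantize_int8(w)
w_hat = dequantize(q, s)
err = np.abs(w - w_hat).max()  # worst-case rounding error, bounded by scale / 2
```

The int8 copy uses a quarter of the float32 memory, which is the throughput/cost lever the role description refers to.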
Skills: NLP, ML, LLM