AI Summary
Orbion Infotech is seeking a Senior Machine Learning Engineer to design, fine-tune, and deploy LLM-based NLP models and pipelines. The ideal candidate will have 5+ years of hands-on ML/NLP engineering experience and a proven track record of delivering end-to-end model solutions in cloud environments.
Key Highlights
Design, fine-tune, and deploy LLM-based NLP models and pipelines
Develop and productionize inference services with low-latency vector search integrations
Implement end-to-end MLOps for LLMs: containerised model serving, CI/CD for models, automated retraining, monitoring, and observability
Technical Skills Required
Python, PyTorch, Hugging Face Transformers, LangChain, FAISS, Docker, Kubernetes, AWS
Benefits & Perks
Remote-first, flexible work model
Opportunity to work on cutting-edge LLM products and shape ML architecture end-to-end
Mentorship culture, regular tech talks, and focus on measurable impact and career growth
Job Description
Senior Machine Learning Engineer - NLP & LLM
About The Opportunity
We operate in the Enterprise AI and NLP sector, building production-grade Large Language Model (LLM) solutions and intelligent language services for B2B customers across search, automation, and analytics. The team focuses on deploying reliable, scalable LLM-powered systems that deliver retrieval-augmented generation (RAG), semantic search, and conversational AI in cloud-native environments. This is a fully remote role based in India.
Role & Responsibilities
- Design, fine-tune, and deploy LLM-based NLP models and pipelines (training ➜ evaluation ➜ inference) to power production features such as RAG, question answering, and summarization.
- Develop and productionize inference services with low-latency vector search integrations (embeddings ➜ FAISS/Pinecone/Milvus) and document retrieval layers, as sketched after this list.
- Implement end-to-end MLOps for LLMs: containerised model serving, CI/CD for models, automated retraining, monitoring, and observability of drift and performance.
- Integrate LLM toolchains (Hugging Face, LangChain) into backend APIs and orchestrate multi-step agent workflows for task automation and conversational agents.
- Optimize model cost and performance: quantization, pruning, batching, and autoscaling for GPU/CPU inference in cloud environments.
- Collaborate with product, data engineering, and QA to define success metrics, run A/B experiments, and deliver reliable model upgrades to customers.
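The sketch below is a minimal illustration of the "embeddings ➜ FAISS" retrieval layer referenced in the responsibilities above: it encodes a few documents, builds an exact inner-product index, and returns the top-k passages that would feed a RAG prompt. The sentence-transformers library and the all-MiniLM-L6-v2 checkpoint are illustrative assumptions; the role itself only names Hugging Face and FAISS.

```python
# Minimal retrieval sketch (assumptions: sentence-transformers for encoding,
# all-MiniLM-L6-v2 as a placeholder embedding model; only FAISS and the
# Hugging Face ecosystem are actually named in the role).
import faiss
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # placeholder checkpoint

docs = [
    "Quarterly revenue grew 12% year over year.",
    "The support bot escalates unresolved tickets to a human agent.",
    "Document embeddings are stored in a FAISS index for low-latency lookup.",
]

# Encode and L2-normalise so inner product equals cosine similarity.
doc_vecs = model.encode(docs, convert_to_numpy=True).astype("float32")
faiss.normalize_L2(doc_vecs)

index = faiss.IndexFlatIP(doc_vecs.shape[1])  # exact inner-product search
index.add(doc_vecs)

# Embed the query and retrieve the top-2 passages for the RAG prompt.
query = "How are support tickets handled?"
q_vec = model.encode([query], convert_to_numpy=True).astype("float32")
faiss.normalize_L2(q_vec)
scores, ids = index.search(q_vec, 2)
for score, i in zip(scores[0], ids[0]):
    print(f"{score:.3f}  {docs[i]}")
```

In production the flat index would typically be swapped for an approximate index (IVF/HNSW) or a managed vector DB such as Pinecone or Milvus, with the retrieval call exposed behind the inference service described above.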
Must-Have (technical skills)
- Python
- PyTorch
- Hugging Face Transformers
- LangChain
- FAISS
- Docker
- Kubernetes
- AWS
- 5+ years of hands-on ML/NLP engineering experience with demonstrable LLM projects or production deployments.
- Proven experience delivering end-to-end model solutions in cloud environments and working with cross-functional teams.
- Experience with RAG architectures, vector DBs (Milvus, Pinecone), or production embedding pipelines.
- Familiarity with MLflow/TFX, model quantization libraries, or on-device inference optimisations (a quantized-loading sketch follows this list).
- Background in prompt engineering, safety/mitigation techniques, and benchmarking LLMs on task-specific metrics.
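As one concrete instance of the quantization tooling mentioned above, the hedged sketch below loads a causal LM in 4-bit through Hugging Face Transformers with bitsandbytes. The checkpoint name is a placeholder, and bitsandbytes, accelerate, and a CUDA GPU are assumed; the role does not prescribe a specific quantization library.

```python
# Hedged sketch: 4-bit quantized inference via Hugging Face Transformers.
# Assumptions: bitsandbytes + accelerate installed, CUDA GPU available,
# and a placeholder checkpoint (swap in whatever model the product uses).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # store weights in 4-bit NF4
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,  # compute in bf16 for stability
)

model_id = "mistralai/Mistral-7B-Instruct-v0.2"  # placeholder model id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",                      # device placement via accelerate
)

prompt = "Summarize: FAISS provides low-latency vector search over embeddings."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```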
Benefits & Perks
- Remote-first, flexible work model with India-wide hiring and cross-functional teams.
- Opportunity to work on cutting-edge LLM products and shape ML architecture end-to-end.
- Mentorship culture, regular tech talks, and focus on measurable impact and career growth.
Skills: NLP, LLM, ML