Senior Machine Learning Engineer - NLP & LLM

Orbion Infotech • India
Remote

Job Description



About The Opportunity

We operate in the Enterprise AI and NLP sector, building production-grade Large Language Model (LLM) solutions and intelligent language services for B2B customers across search, automation, and analytics. The team focuses on deploying reliable, scalable LLM-powered systems that deliver retrieval-augmented generation (RAG), semantic search, and conversational AI in cloud-native environments. This is a fully remote role based in India.

Role & Responsibilities

  • Design, fine-tune, and deploy LLM-based NLP models and pipelines (training → evaluation → inference) to power production features such as RAG, question answering, and summarization.
  • Develop and productionize inference services with low-latency vector search integrations (embeddings → FAISS/Pinecone/Milvus) and document retrieval layers.
  • Implement end-to-end MLOps for LLMs: containerised model serving, CI/CD for models, automated retraining, monitoring, and observability of drift and performance.
  • Integrate LLM toolchains (Hugging Face, LangChain) into backend APIs and orchestrate multi-step agent workflows for task automation and conversational agents.
  • Optimize model cost and performance: quantization, pruning, batching, and autoscaling for GPU/CPU inference in cloud environments.
  • Collaborate with product, data engineering, and QA to define success metrics, run A/B experiments, and deliver reliable model upgrades to customers.

Skills & Qualifications

Must-Have (technical skills)

  • Python
  • PyTorch
  • Hugging Face Transformers
  • LangChain
  • FAISS
  • Docker
  • Kubernetes
  • AWS

Minimum Qualifications

  • 5+ years of hands-on ML/NLP engineering experience with demonstrable LLM projects or production deployments.
  • Proven experience delivering end-to-end model solutions in cloud environments and working with cross-functional teams.

Preferred

  • Experience with RAG architectures, vector DBs (Milvus, Pinecone), or production embedding pipelines.
  • Familiarity with MLflow/TFX, model quantization libraries, or on-device inference optimisations.
  • Background in prompt engineering, safety/mitigation techniques, and benchmarking LLMs on task-specific metrics.

Benefits & Culture Highlights

  • Remote-first, flexible work model with India-wide hiring and cross-functional teams.
  • Opportunity to work on cutting-edge LLM products and shape ML architecture end-to-end.
  • Mentorship culture, regular tech talks, and focus on measurable impact and career growth.

How to apply: Apply with your resume and a short note highlighting relevant LLM/NLP projects or links to repos/demos. Candidates with concrete examples of production LLM deployments or benchmarks will be prioritised.

Skills: NLP, LLM, ML
