AI / LLM Deployment Engineer

Walker Lovell • United Arab Emirates
Remote • Visa Sponsorship
Job Description


AI / LLM Deployment Engineer

Location: Remote (Gulf Standard Time, UTC+4, preferred) with occasional travel to Abu Dhabi if required

Travel: Occasional international travel

Compensation: Exceptional package reflecting seniority, technical expertise and impact


What's in it for you?

This isn't another AI application role. You'll lead the deployment of large language models including DeepSeek, Kimi and Qwen into sovereign, air-gapped environments where GPU performance, inference optimisation and security are business-critical. If you're passionate about high-performance AI infrastructure, this is an opportunity to solve problems that very few engineers get to tackle.


Package / Benefits

  • Exceptional package reflecting seniority and specialist expertise
  • Fully remote initially with flexibility for future relocation if desired
  • Visa sponsorship available where applicable
  • Work with cutting-edge open-weight LLMs and enterprise GPU infrastructure
  • Influence deployment architecture from the ground up


Why this business

Join a globally focused technology business developing sovereign AI and intelligence platforms for highly regulated environments across multiple international markets. Working at the forefront of secure AI deployment, the organisation is investing heavily in advanced infrastructure and offers the opportunity to solve technically demanding challenges alongside a highly experienced engineering team.


What you'll be doing

  • Architect and deploy LLMs including DeepSeek, Kimi, Qwen and LLaMA into secure, air-gapped production environments
  • Configure and optimise NVIDIA H100/H200 GPU clusters, NVLink and InfiniBand infrastructure for high-performance inference
  • Apply GPTQ, AWQ and GGUF quantisation techniques to maximise deployment efficiency without compromising model performance
  • Deploy and optimise inference runtimes including vLLM, TGI and Ollama within Kubernetes environments, delivering target throughput and latency SLAs


What you'll bring

  • Proven commercial experience deploying production LLMs using vLLM, TGI, Ollama or equivalent inference platforms
  • Expert knowledge of Kubernetes, NVIDIA GPU infrastructure, GPU memory optimisation and high-performance computing
  • Hands-on experience with model quantisation techniques including GPTQ, AWQ or GGUF
  • Experience delivering on-premises or air-gapped AI deployments. Experience within government, defence, cyber security or other highly regulated environments would be advantageous.


Who this suits


You're an infrastructure engineer who thrives on solving complex deployment challenges rather than building AI applications. You understand what it takes to run large language models reliably at scale, enjoy optimising GPU performance and want to work on technically demanding projects where security, performance and engineering excellence are non-negotiable.


Apply now for a confidential conversation with Walker Lovell.

