Senior MLOps/DevOps Engineer - AI Super App

Remote

Job Description


Company Description: Konnect Co. Ltd. is building an AI-powered Super App that transforms how international visitors and residents experience Korea by removing barriers related to language, payment, and identity. The platform is designed for tourists, international students, and digital nomads who often struggle with local apps that require Korean phone numbers, national IDs, or bank accounts. Using advanced multilingual AI, Konnect converts global intent into seamless local actions, enabling users to book, pay, and connect without switching apps or needing local credentials. Through eSIM-based digital identity and integrated payment solutions, Konnect eliminates the need for Korean numbers or bank accounts, making everyday tasks more accessible. The company’s mission is to serve as a digital bridge that makes Korea more welcoming, efficient, and inclusive for the global community.


(https://koreatechdesk.com/k-startup-grand-challenge-ksgc-comeup-2025-winner-indias-konnect)


Job Role: We are seeking a Senior MLOps/DevOps Engineer to build, secure, and scale the infrastructure powering our core AI pipelines. You will partner directly with our Senior AI Engineers to support data acquisition workflows, manage large-scale multi-GPU training environments, and orchestrate high-throughput vLLM inference clusters. Your role is critical in transforming experimental neural network models into highly available, cost-effective production services.


Work Structure


Location: Remote (with site visits and possible relocation to South Korea)


Compensation: As per experience and industry standards


Key Responsibilities


• Inference Orchestration: Deploy, auto-scale, and monitor high-throughput vLLM endpoints inside Kubernetes, optimizing resource allocation for PagedAttention and continuous batching.

• GPU Cluster Management: Provision, partition, and maintain multi-node GPU compute configurations (e.g., NVIDIA A100/H100 instances) for large-scale training and quantization pipelines.

• Data & Storage Infrastructure: Architect highly scalable storage solutions (e.g., MinIO, AWS S3) and data pipelines to ingest, store, and preprocess massive multilingual datasets.

• CI/CD for AI: Build automated pipelines that handle the entire model lifecycle, from raw dataset versioning and weight tracking through automated post-training quantization (AWQ/GPTQ) to image registry promotion.

• Vector DB Management: Maintain and scale high-availability vector databases (e.g., Qdrant, Milvus, or Pinecone), ensuring fast read/write times for compressed embeddings.


Technical Requirements


• Containerization & Orchestration: Expert-level Kubernetes (K8s) and Docker, with specific experience running KServe, Ray, or Kubeflow for AI workloads.

• GPU Virtualization: Deep knowledge of the NVIDIA Container Toolkit, CUDA driver updates, and GPU sharing/slicing techniques (MIG).

• Model Serving Engines: Practical experience deploying and profiling vLLM, Triton Inference Server, or TensorRT-LLM in a production setting.

• Infrastructure as Code (IaC): Mastery of Terraform or CloudFormation for provisioning multi-cloud resources (AWS, GCP, Azure, or RunPod/CoreWeave).

• ML Lifecycle Tools: Hands-on experience with model and data versioning frameworks such as MLflow, Weights & Biases (W&B), or DVC.


Preferred Qualifications


• Solid scripting skills in Python and Bash to automate model evaluation and quantization.

• Background in monitoring large AI systems using Prometheus, Grafana, and OpenTelemetry to track GPU usage and token latency metrics such as time to first token (TTFT) and inter-token latency (ITL).

• Experience with data pipeline orchestrators like Apache Airflow or Prefect.


Experience & Education


• Bachelor’s degree in Computer Science, DevOps Engineering, Systems Architecture, or a related field.

• 5+ years of experience as a DevOps or MLOps Engineer, with at least 1–2 years directly supporting GPU-intensive deep learning or LLM production workloads.

