AI Platform Engineer (LLM Infrastructure)

Akvelon, Inc. United States
Remote
AI Summary

Akvelon is seeking an AI Platform Engineer to build and operate an internal AI platform for efficient AI-powered service delivery. Key responsibilities include managing Kubernetes infrastructure, supporting LLM workflows, and integrating AI tooling. Requires strong Kubernetes, Terraform, Python, and MLOps experience, with a focus on improving DevEx and reducing time-to-market.

Key Highlights
Build and operate an internal AI platform for LLM-based services.
Focus on improving Developer Experience (DevEx) and reducing time-to-market.
Requires strong Kubernetes, Terraform, and Python skills with MLOps context.
Technical Skills Required
Kubernetes, AKS, Terraform, Python, CI/CD, RAG, Kubeflow Pipelines, Argo Workflows, Azure AI Foundry, LiteLLM, Langfuse, MLflow
Benefits & Perks
Relocation support available

Job Description


This engagement is focused on building an internal AI platform that enables developers to ship AI-powered services efficiently. Scope includes model connectivity, prompt testing and evaluation, monitoring/observability, and the underlying AI infrastructure layer.


The objective is to improve DevEx and reduce time-to-market for AI features.


Location: Serbia (relocation support available), Croatia, Poland, Portugal


Tasks

  • Build and operate the AI platform infrastructure enabling developers to ship LLM-based services faster.

  • Implement and maintain Kubernetes-based runtime environments (incl. AKS) for AI workloads.

  • Manage infrastructure as code with Terraform (modules, environments, CI/CD automation).

  • Support LLM workflows: RAG, agents, prompt experimentation, evaluations, and deployment patterns.

  • Integrate and operate tooling such as Azure AI Foundry, LiteLLM, Langfuse, MLflow.

  • Orchestrate pipelines using Kubeflow Pipelines and/or Argo Workflows (build, deploy, evaluate).

  • Improve platform reliability and observability (monitoring, logging, tracing, cost/perf signals).

  • Collaborate closely with developers to streamline DevEx (APIs, templates, docs, golden paths, automation).


Requirements

  • Strong hands-on experience with Kubernetes in production (preferably AKS).

  • Solid Terraform expertise (IaC best practices, multi-env setups).

  • Practical experience supporting ML/LLM workloads in a platform or DevOps/MLOps context.

  • Proficiency in Python for automation, scripting, and supporting APIs/evaluation tooling.

  • Understanding of CI/CD, release processes, and production-grade operations.

  • Ability to work under tight timelines and deliver pragmatically.


Nice to Have



  • Experience building internal developer platforms or “paved roads” for engineering teams.

  • Familiarity with LLM evaluation frameworks, prompt testing workflows, and LLM observability.

  • Exposure to RAG architectures, vector databases, and agentic patterns.

  • Experience with Kubeflow, Argo, and ML lifecycle tooling.


Engagement Type



  • Long-term B2B contract.


Team



  • You will join a team of 5, with 3 AI Platform Engineers being added.


Location / Timezone



  • Remote work from Croatia, Poland, Portugal, and Serbia.

  • European working hours.

  • Occasional availability for meetings until 10:00 AM PST (US overlap).
