We are seeking a Senior DevOps Engineer to join our team responsible for running and operating a large-scale AI platform. The ideal candidate will have experience with Kubernetes, distributed systems, and microservice architectures. This is a challenging role that requires strong troubleshooting and analytical skills.
Job Description
DevOps Engineer — Kubernetes / Distributed Systems / AI Platform
Remote anywhere in Germany | HQ in NRW | Work from anywhere for up to 180 days per year
This is not a “keep the lights on” DevOps role...
You’ll be part of the team responsible for running and operating a large-scale AI platform used in complex customer environments — including highly customised on-premise infrastructure deployments.
The challenge here isn’t just Kubernetes.
It’s making a highly distributed, containerised system run reliably in environments you don’t fully control.
That means troubleshooting under pressure, improving deployment processes, working directly with customer-side infrastructure teams, and owning the operational reality of a production AI platform end-to-end.
You’ll be working on systems running more than 1,000 containers in production across a large microservice architecture, helping improve everything from CI/CD pipelines and observability to runtime stability and deployment reliability.
This role is heavily focused on runtime operations, incident handling, and delivery infrastructure — not feature development.
The Engineering Muscle You Bring
- Experience with Kubernetes or OpenShift
- Strong understanding of distributed systems and microservice architectures
- Experience with CI/CD tooling such as Jenkins, GitHub Actions, Ansible, or ArgoCD
- Experience with monitoring and observability tooling such as Grafana, Loki, Prometheus, OpenTelemetry, Dynatrace, or Instana
- Knowledge of technologies like Redis, Postgres, MariaDB, Kafka, Elastic, or Minio
- Strong troubleshooting and analytical skills
- Hands-on engineering mindset with a strong sense of ownership
- Fluent German and English communication skills
Why This Role Appeals to People Who Like Complexity
- Large-scale production systems with 1,000+ containers running live
- Complex Kubernetes and OpenShift environments
- Real operational ownership instead of pure maintenance work
- Challenging on-premise customer deployments
- Exposure to modern AI platforms and distributed architectures
- High-impact work with lots of technical depth and learning potential
Khalifa@thryvetalent.com