Senior Cloud Infrastructure Engineer

Pentasia • United States
Remote
Technical Skills Required
Cloud infrastructure, Kubernetes, Azure, Kafka, Redis, Postgres, Terraform/Bicep, Prometheus, Grafana, OpenTelemetry (OTEL)

Job Description


Company Overview

Our client is a new, fully remote market maker on prediction markets, funded by a tier 1 operator to build next-generation trading systems and drawing on that operator's extensive industry IP and resources. The founding team consists of industry veterans with a vision for the future. They are building a lean, senior team focused on high-performance systems, pragmatic engineering, and rapid iteration.


Mission

Build and operate the core platform and trading infrastructure that powers this new prediction market trading system.


Why This Role

You will own end-to-end infrastructure, freeing software engineers to ship features rapidly while ensuring reliability, performance, and security. This is the first dedicated infrastructure leadership hire and a force multiplier for the entire engineering team.


What You Will Do

  • Design, build, and operate cloud infrastructure on Azure to support low-latency integrations with the operator's systems and multiple external exchanges.
  • Establish Kubernetes-based container orchestration; manage cluster sizing, autoscaling, and multi-environment deployments.
  • Stand up and optimize core services: Kafka (event streaming), Redis (caching), and Postgres (OLTP).
  • Implement robust CI/CD, secrets management, environment isolation, and infrastructure-as-code (Terraform/Bicep).
  • Own observability across metrics, logging, and distributed tracing using Prometheus, Grafana, and OpenTelemetry (OTEL); drive incident response, SLOs/SLIs, and on-call engineering practices.
  • Design and manage cloud networking: VNets, subnets, peering, private endpoints, DNS, firewall rules, and secure connectivity between our systems and the operator.
  • Harden security and compliance across identity, network segmentation, secrets, and OS baselines within the Azure/Windows ecosystem.
  • Drive performance engineering for market data ingestion and order routing; remove bottlenecks across the stack.
  • Partner with software engineers to shape service boundaries, data contracts, and platform primitives that accelerate delivery.
  • Manage cloud cost efficiency and capacity planning while ensuring reliability and high availability.
  • Set standards and grow an infrastructure function that scales with the company.


What You Bring

  • Senior-level infrastructure experience (8+ years) including significant time operating production systems at scale.
  • Deep cloud infrastructure expertise; the current stack runs on Azure, but strong candidates from AWS or GCP backgrounds are equally welcome: cloud fundamentals matter more than platform-specific familiarity.
  • Hands-on with Kubernetes, Kafka, Redis, and Postgres in production.
  • Expertise in networking, security, identity (e.g., Azure AD), and infra-as-code (Terraform/Bicep or similar).
  • Track record of building reliable, observable platforms with strong CI/CD and release engineering; hands-on experience with Prometheus, Grafana, and OpenTelemetry or comparable observability stacks.
  • Strong cloud networking fundamentals: VNet design, private endpoints, DNS, firewall/NSG rules, and secure cross-system connectivity on Azure.
  • Performance tuning for latency- and throughput-sensitive workloads.
  • Excellent collaboration skills; able to translate product and trading needs into platform capabilities.
  • A hands-on, curious approach to AI tooling. The team expects everyone to actively use and explore AI tools in their daily work, whether for infra-as-code generation, log analysis, runbook automation, or incident triage. You don't need to be an AI researcher, but you should be genuinely experimenting and keeping pace with a rapidly evolving landscape.


Nice to Have

  • Experience in trading, sports betting, exchanges, financial markets, or other real-time systems.
  • Experience with event-driven architectures and stream processing.
  • Background in SRE leadership, incident command, or reliability programs.

