Senior Cloud ML Engineer for GenAI and MLOps

Brooksource United States
Remote
This position is no longer accepting applications.
AI Summary

Implement and optimize enterprise-grade GenAI and MLOps infrastructure using AWS and Azure. Develop automated logging and observability tools. Collaborate with cross-functional teams to deliver cloud AI solutions.

Technical Skills Required
AWS, Azure, Databricks, Delta Lake, Unity Catalog, Terraform, CloudFormation, SageMaker, GitLab CI, GitHub Actions, CodePipeline, Python, Java, C++
Benefits & Perks
100% remote work
W-2 with benefits
Contract-to-hire

Job Description


Senior Cloud ML Engineer – GenAI & MLOps

Contract-to-Hire (W-2 with Benefits)

100% Remote (CST Work Hours)


Our Fortune 50 healthcare client is seeking a Senior Cloud ML Engineer to implement and optimize enterprise-grade GenAI and MLOps infrastructure. This role is part of a new AI Shared Services team and works directly with the Lead Cloud ML Engineer to build secure, scalable access to cloud AI/ML services from AWS and Azure. You will focus on hands-on development, integration, and deployment of AI platform components that enable standardized MLOps/LLMOps capabilities across the organization.


Responsibilities:

  • Build and configure components of the AI Shared Services platform supporting cloud AI/ML/GenAI services from AWS and Azure.
  • Implement features for the AI Gateway to standardize MLOps/LLMOps frameworks, centralize model access, enforce governance, and enable usage tracking and cost optimization.
  • Develop automated logging of model inputs/outputs to Databricks Delta tables via Unity Catalog for observability and compliance.
  • Apply guardrails for PII protection, prompt injection defense, harmful content filtering, and rate limiting to ensure security, compliance, and cost control.
  • Configure observability, logging, and monitoring tools to ensure reliability and auditability for ML, LLM, and GenAI workloads.
  • Deploy and manage cloud-native inference infrastructure using AWS SageMaker, including containerized services that provide ML/LLM models through scalable APIs.
  • Integrate AI infrastructure components into CI/CD pipelines (GitLab CI, GitHub Actions, CodePipeline) for automated deployments.
  • Collaborate with Cloud Platform Engineering, Data Engineering, Data Science, and Security teams to deliver cloud AI solutions aligned with enterprise standards.
  • Contribute to technical documentation and apply best practices for MLOps and LLMOps.


Requirements:

  • Bachelor’s degree in Computer Science or Data Science required; Master’s preferred.
  • 10+ years of experience in cloud platform engineering and ML engineering.
  • Hands-on experience implementing MLOps/LLMOps frameworks and integrating AI services into enterprise cloud infrastructure environments.
  • Experience provisioning, configuring, and integrating cloud AI/ML services into an enterprise environment, specifically Amazon Bedrock or Azure AI Foundry.
  • Experience building cloud-native AI/ML services, including SageMaker pipelines and inference endpoints, and integrating them into CI/CD pipelines.
  • Experience deploying and scaling LLM and GenAI workloads as cloud infrastructure components within MLOps pipelines.
  • Experience building and deploying Infrastructure as Code using Terraform or CloudFormation.
  • Experience with Databricks, Delta Lake, and Unity Catalog for data governance and observability.
  • Experience implementing security and compliance controls to protect PII and regulated data within AI/ML workloads.
