DevOps Engineer - CI/CD and Monitoring Systems

Bjak China
Remote
AI Summary

Join BJAK as a DevOps Engineer to build and maintain reliable CI/CD pipelines and monitoring systems for an AI automation platform powering end-to-end insurance journeys. You will design deployment automation, implement safe release strategies, and enhance system observability to reduce production risk and improve operational confidence. This fully remote role requires strong experience in DevOps, CI/CD, and production monitoring to support business-critical insurance workflows across Southeast Asia.

Key Highlights
Fully remote position based in China collaborating with Malaysia-based teams
Build and maintain CI/CD pipelines and monitoring systems for AI automation platform
Focus on deployment safety, system observability, and operational reliability at scale
Technical Skills Required
CI/CD pipelines
Monitoring and observability systems
System reliability and operational risk management
Benefits & Perks
Fully remote work arrangement
Learning & Development Budget for continuous technical growth
Competitive compensation package based on experience and impact

Job Description


BJAK’s automation systems power end-to-end insurance journeys across quote generation, policy issuance, renewals, endorsements, claims, payments and insurer integrations. These systems are business-critical: deployment stability, monitoring and fast recovery directly affect customers and operations.

We're looking for a DevOps Engineer based in China to strengthen CI/CD systems, monitoring infrastructure and production visibility across BJAK’s AI automation platform, ensuring engineers can ship safely and systems remain highly observable and reliable.

This is a fully remote position where you'll collaborate closely with our Malaysia-based engineering, product and operations teams to improve deployment safety and system observability at scale.

The Mission

Build and maintain reliable CI/CD pipelines and monitoring systems that enable fast, safe and observable deployments across BJAK’s AI automation platform, reducing production risk while improving system visibility and operational confidence.

What You’ll Own

  • Design and maintain CI/CD pipelines for multiple services across the platform.
  • Improve deployment automation, release strategies and rollback mechanisms.
  • Build and enhance monitoring, alerting and observability systems across production services.
  • Ensure system health visibility through metrics, logs, traces and dashboards.
  • Work with engineers to reduce deployment risk and improve release confidence.
  • Implement safe deployment strategies such as canary, blue-green or phased rollouts.
  • Improve incident detection speed and reduce mean time to recovery (MTTR).
  • Support infrastructure reliability for business-critical insurance workflows.
  • Standardize deployment and monitoring practices across engineering teams.
  • Continuously improve CI/CD performance, stability and maintainability.

What We're Looking For

  • Experience in DevOps, SRE, platform engineering or infrastructure roles.
  • Strong understanding of CI/CD pipelines, deployment automation and release engineering.
  • Experience with monitoring, logging and observability systems in production environments.
  • Ability to troubleshoot deployment and production issues in a structured and calm manner.
  • Strong understanding of system reliability, uptime and operational risk.
  • Experience supporting production systems with high availability requirements.
  • Hands-on ownership mindset during incidents and deployment failures.
  • Practical judgment on release safety, performance and system stability.
  • Strong collaboration with engineering teams in fast-paced environments.
  • Low ego and disciplined approach to production operations.

Bonus Points

  • Experience with Jenkins, GitHub Actions, GitLab CI or similar CI/CD tools.
  • Experience with Kubernetes, Docker or container-based deployments.
  • Experience with observability stacks (Prometheus, Grafana, ELK, Datadog, etc.).
  • Experience with infrastructure-as-code tools (Terraform, Ansible, etc.).
  • Experience with zero-downtime deployments and progressive delivery strategies.
  • Experience with cloud platforms (AWS, GCP, Azure).
  • Experience in fintech, insurance or other high-availability industries.
  • Experience improving deployment velocity and reliability at scale.
  • Contributions to CI/CD or monitoring system improvements.

The Kind of Builder We Want

  • Thinks in terms of deployment safety, system visibility and operational reliability.
  • Hands-on engineer who understands both pipelines and production systems deeply.
  • Calm and structured when handling deployment failures or production incidents.
  • Strong focus on observability, automation and release confidence.
  • Proactive in preventing issues rather than reacting to them.
  • Careful and deliberate when making production changes.
  • Builds systems engineers trust to deploy frequently and safely.

This Role Is Not For

  • Engineers who only react to deployment failures instead of preventing them.
  • People who are careless with production pipelines or release processes.
  • Individuals who ignore monitoring, alerting or system visibility.
  • Engineers who make risky deployment changes without proper safeguards.
  • Candidates who cannot stay calm during incidents or deployment failures.

Success in This Role

You'll be successful if you can:

  • Improve deployment safety, speed and reliability across all services.
  • Strengthen monitoring, alerting and system observability coverage.
  • Reduce production incidents caused by releases or configuration changes.
  • Improve MTTR through better visibility and incident tooling.
  • Enable engineers to ship with confidence and minimal operational risk.

Why Join BJAK

  • Build Reliable Delivery Systems – Own CI/CD and monitoring for AI automation platforms.
  • High-Impact Engineering – Solve real-world release engineering and observability challenges.
  • Global Engineering Team – Work with experienced engineers across multiple countries.
  • Fully Remote – Work remotely from China while collaborating with our Malaysia-based teams.
  • International Exposure – Build systems used across Southeast Asian markets.
  • Learning & Development Budget – Support continuous technical growth and DevOps expertise.
  • High Ownership Environment – Strong autonomy over deployment and monitoring architecture.
  • Modern Engineering Culture – Focus on reliability, speed and engineering excellence.
  • Competitive Compensation – Attractive salary package based on experience and impact.

Interview Process

We assess DevOps depth, CI/CD design thinking and production reliability experience. The process usually includes application review, two interviews and a technical scenario or systems discussion.

