Senior Data Engineer

Caseware Colombia
Remote
AI Summary

Design, build, and operate reliable ETL/ELT and ingestion pipelines. Improve end-to-end data-lake foundations and build event-driven data flows. Partner across Platform/AI/DevOps/Product to lead pragmatic modernization.

Key Highlights
Design and operate ETL/ELT pipelines
Improve data-lake foundations
Build event-driven data flows
Key Responsibilities
Design, build, and operate reliable ETL/ELT and ingestion pipelines
Improve end-to-end data-lake foundations
Build event-driven data flows
Partner across Platform/AI/DevOps/Product to lead pragmatic modernization
Technical Skills Required
JVM services, AWS, EKS, Lambda, S3, DynamoDB, SNS/SQS, Lake Formation, Glue Catalog, OpenSearch Serverless, GitHub/GitHub Actions, Nx monorepo, Jira/Confluence, Spark, EMR, Glue
Benefits & Perks
Competitive compensation
100% remote work environment
Prepaid medicine
Life insurance
Funeral assistance
Internet allowance
Home office stipend
5 Personal Time Off days per year
Sick leave top-up: the employer tops up pay to 100% of salary from day 3 through day 90
Recognition Award
Additional paid time off in recognition of the corresponding year of service
Upgraded vacation starting at 5 years of service
Nice to Have
JVM-first data processing experience (Java/Kotlin/Scala) with Spark-based workloads
Experience with schema evolution and data contracts (versioning strategies, backfills, compatibility)

Job Description


What you will be doing:

  • Design, build, and operate reliable ETL/ELT and ingestion pipelines moving data from transactional systems into analytics/AI-ready platforms.
  • Improve end-to-end data-lake foundations: storage layout, partitioning, schema evolution/versioning, lineage, cataloging, and delta synchronization.
  • Build and operate event-driven data flows that power real-time integrations and AI agent orchestration.
  • Help scale retrieval workflows (vector storage/indexing, embedding pipelines, RAG-adjacent data flows) that support production-grade AI capabilities.
  • Strengthen reliability across services and pipelines: retries, backoff, DLQs, idempotency, reconciliation, and operational observability.
  • Lead pragmatic modernization: reduce accidental coupling between business logic and infrastructure, improve contracts, and make systems easier to run locally and operate in production.
  • Partner across Platform/AI/DevOps/Product; lead proof-of-concepts and translate results into durable platform capabilities.
  • Participate in an on-call rotation and drive post-incident improvements (post-mortems, root cause analysis, and prevention).


You’re a great fit if you sound like one of these profiles

Profile A (most common): Backend/platform engineer who is strong in JVM distributed systems and has shipped real data workflows (Spark/EMR/Glue exposure).

Profile B: Data engineer who has built pipelines and is comfortable owning services, async messaging semantics, and production operations—not only transformations.


What you’ll bring:

  • Strong software engineering fundamentals: designing maintainable, testable systems and owning features end-to-end.
  • Production experience with distributed systems: async workflows, failure modes, retries, and eventual consistency.
  • Hands-on experience building and owning ETL/ELT pipelines, including ingestion from OLTP sources into a data lake.
  • Experience operating data systems in production: monitoring, incident response, and continuous improvement.
  • Cloud experience on AWS building production systems (not just using services): storage + messaging + orchestration.
  • Strong collaboration and communication; ability to mentor and raise engineering maturity through reviews and design discussions.
  • Strong English-language communication and collaboration skills.


Strongly preferred (high-signal)


  • JVM-first data processing experience (Java/Kotlin/Scala) with Spark-based workloads.
  • Experience with schema evolution and data contracts (versioning strategies, backfills, compatibility).
  • Operational ownership of pipeline reliability: replay safety, DLQ patterns, reconciliation, lineage thinking.
  • IaC experience (CDK preferred; CloudFormation/Terraform acceptable).


The Tech Stack You’ll Work With:

  • JVM services (Java 21+ / Spring microservices) and some Python.
  • AWS: EKS, Lambda; storage/messaging/catalog primitives (S3, DynamoDB, SNS/SQS, Lake Formation, Glue Catalog).
  • Search/retrieval: OpenSearch Serverless and related vector storage/retrieval components.
  • Tooling: GitHub/GitHub Actions, Nx monorepo, Jira/Confluence.


Why this role exists

Caseware is evolving Caseware Cloud to deliver intelligent, data-driven experiences—powering analytics, automation, and AI/agentic capabilities on top of a modern data platform.

This role is for someone who can bridge transactional backend systems and data-intensive distributed workflows. You’ll work on systems that combine:

  • APIs and domain services (microservices, relational modeling, service boundaries)
  • Asynchronous workflows (messaging, retries, idempotency, replay safety)
  • Distributed/batch data processing (Spark-based processing and lake patterns)
  • Cloud platform primitives (AWS orchestration and managed services)
  • AI-ready retrieval workflows (embedding + vector retrieval pipelines)


What success looks like (first 6–12 months)

  • Improved reliability and operability of ingestion + async workflows (clearer idempotency/replay patterns, fewer recurring incidents).
  • Cleaner boundaries between orchestration/control-plane concerns and data-processing execution concerns.
  • Better observability across APIs, queues, workflows, and distributed jobs.
  • Clearer data contracts and more predictable schema evolution practices.
  • Tangible improvements in developer experience (local run, testing, reduced “environment-only” hacks).


Perks & Benefits

  • Indefinite-term employment contract ("contrato a término indefinido") with all legal benefits
  • Prepaid Medicine
  • Life insurance and funeral assistance
  • Internet allowance
  • Home office stipend
  • Competitive compensation — above the market average
  • 100% remote work environment and an excellent work-life balance
  • Opportunity to work for a growing global SaaS leader
  • A culture that promotes independence, innovation, trust, and accountability
  • Room to be creative and innovative and to strategize for the future
  • Mentorship by highly experienced professionals
  • Budget for training; we want you to grow
  • 5 Personal Time Off days per year
  • Sick leave top-up: the employer tops up pay to 100% of salary from day 3 through day 90
  • Recognition Award: additional paid time off in recognition of the corresponding year of service
  • Upgraded vacation starting at 5 years of service

