Principal Data Engineer - Spark Expertise

Orbis Group • United States
Remote
AI Summary

We are seeking a Principal Data Engineer with deep Spark expertise to architect and scale the data backbone behind cutting-edge AI-driven systems. The successful candidate will design and evolve distributed, cloud-based data infrastructure and build high-performance data pipelines. This is a remote opportunity with a VC-backed conversational AI scale-up.

Key Highlights
Design and evolve distributed, cloud-based data infrastructure
Build high-performance data pipelines for analytics, AI/ML workloads, and integrations
Champion data reliability, quality, and observability across pipelines
Technical Skills Required
Python, Spark, PySpark, SQL and NoSQL databases (PostgreSQL, DynamoDB, Cassandra), distributed computing, modern data modeling
Benefits & Perks
Remote work
Great equity
Opportunity to join a high-growth company

Job Description


Principal Data Engineer

📍 Remote – USA


A VC-backed conversational AI scale-up is expanding its engineering team and is looking for a Principal Data Engineer with deep Spark expertise to help architect and scale the data backbone behind cutting-edge AI-driven systems.


What You’ll Do


  • Design and evolve distributed, cloud-based data infrastructure that supports both real-time and batch processing at scale.
  • Build high-performance data pipelines that power analytics, AI/ML workloads, and integrations with third-party platforms.
  • Champion data reliability, quality, and observability, introducing automation and monitoring across pipelines.
  • Collaborate closely with engineering, product, and AI teams to deliver data solutions for business-critical initiatives.


What We’re Looking For


  • 5+ years in software development and data engineering with ownership of production-grade systems.
  • Proven expertise in Spark/PySpark.
  • Strong knowledge of distributed computing and modern data modeling approaches.
  • Solid programming skills in Python, with an emphasis on clean, maintainable code.
  • Hands-on experience with SQL and NoSQL databases (e.g., PostgreSQL, DynamoDB, Cassandra).
  • Excellent communicator who can influence and partner across teams.


Bonus Points


  • Experience in high-growth, early-stage environments.
  • Familiarity with MLOps and deploying ML models into production data workflows.
  • A problem-solver at heart, excited by innovation and complex challenges.


This role offers fully remote work, great equity, and the chance to join a rocketship. If you'd like to find out more, don't hesitate to apply!

