Senior Data Engineer: Databricks, Delta Lake, and Lakehouse Architecture

open systems inc. • United States
Remote
AI Summary

Design, build, and operate scalable analytics and streaming solutions using Databricks, Delta Lake, and lakehouse architecture. Collaborate with stakeholders to understand business challenges and translate them into technical requirements. Develop and maintain standardized datasets while supporting ad hoc analytical needs.

Key Highlights
Design and build scalable data pipelines
Develop and maintain standardized datasets
Implement robust data quality frameworks
Key Responsibilities
Define and document data requirements
Design and build scalable data pipelines
Develop and maintain standardized datasets
Technical Skills Required
Databricks, Delta Lake, Delta Live Tables, Apache Spark, Apache Kafka, AWS services, SQL
Benefits & Perks
Hybrid work arrangement
100% remote option
Long-term contract
Nice to Have
Experience with Snowflake
Familiarity with NoSQL databases
Experience with enterprise messaging systems

Job Description


Title: Big Data Engineer: Various levels (Lead, Senior, Intermediate)

Location: Atlanta, GA 303038 (Hybrid: 2x/Week)

Contract: 6+ Months. Long-term.

Industry: Rail Transportation.


***If not in the Atlanta area, then 100% REMOTE.


Overview

  • Client Corporation is seeking a Senior Data Engineer to collaborate across the organization and deliver reliable, scalable data solutions. You will join a high-performing team focused on modern data platforms, helping design, build, and operate a Databricks-based lakehouse architecture and streaming analytics ecosystem.
  • This role combines strong business acumen with hands-on engineering expertise. You will work closely with stakeholders to understand business challenges, translate them into technical requirements, design pragmatic data architectures, and deliver production-grade data pipelines. Ownership spans the full lifecycle—from design and development through deployment and ongoing operations.
  • You will partner with data modelers, business intelligence teams, and cross-functional stakeholders to define requirements, align on scope, and ensure high-quality delivery while promoting engineering best practices and consistency.


Key Responsibilities

  • Define and document data requirements; ingest, integrate, and process large volumes of structured, semi-structured, and unstructured data.
  • Design and build scalable data pipelines for ingestion, transformation, validation, and enrichment to support downstream analytics and BI use cases.
  • Develop and maintain standardized datasets while supporting ad hoc analytical needs.
  • Implement robust data quality frameworks and continuously improve trust and reliability of datasets.
  • Contribute to data governance practices, including access control, data retention, and handling of sensitive data in alignment with enterprise policies.
  • Collaborate with data science and BI teams to deliver data models and pipelines for reporting, analytics, and machine learning.
  • Build and optimize pipelines that clean, transform, aggregate, and publish data into curated layers within the lakehouse architecture.
  • Utilize Databricks, Apache Spark, SQL, and AWS services to integrate and process data efficiently.
  • Apply sound data architecture principles to balance performance, scalability, cost, and maintainability.
  • Champion best practices in data engineering, including testing, observability, and operational readiness.
  • Participate in Agile development processes, including backlog refinement, sprint planning, and cross-team coordination.


Required Qualifications

  • Bachelor’s degree in Computer Science, Information Systems, Engineering, or a related field (or equivalent practical experience).
  • 5+ years of professional experience in data engineering, working with large-scale datasets in production environments.


Technical Expertise:

  • Databricks (Required): Hands-on experience building and operating Databricks workflows, notebooks, and production pipelines using Apache Spark (Spark SQL and/or PySpark).
  • Delta Lake (Required): Experience designing and maintaining Delta Lake tables, including incremental processing, merges/upserts, schema evolution, and performance optimization.
  • Delta Live Tables (DLT) (Required): Experience building and managing DLT pipelines with a focus on dependencies, incremental processing, and monitoring.
  • Databricks Governance (Required): Experience with Unity Catalog or similar governance frameworks, including secure data sharing and access control.
  • Apache Spark: 4+ years of experience building and optimizing batch and/or streaming data pipelines.
  • Streaming Technologies: 3+ years of experience with Apache Kafka or managed equivalents (e.g., Confluent), including scaling, throughput, and fault tolerance.
  • AWS Ecosystem: 3+ years of experience with AWS services (e.g., S3, IAM, and related analytics integrations).
  • SQL: Strong proficiency in writing and optimizing complex queries and translating business logic into data models.
  • Proven experience delivering ETL/ELT pipelines in a lakehouse environment, including incremental loads and data quality enforcement.
  • Experience working in Agile environments (Scrum, Kanban, SAFe, or similar).


Preferred Qualifications

  • Experience with Snowflake or other large-scale analytical databases.
  • Familiarity with NoSQL databases such as Cassandra.
  • Experience with enterprise messaging systems (e.g., TIBCO EMS, IBM MQ) in addition to Kafka.


Role Summary

  • Senior Data Engineers are responsible for the architecture, design, and delivery of scalable analytics and streaming solutions that transform enterprise data into governed, high-quality datasets for business intelligence and advanced analytics.
  • They operate across the full data lifecycle—partnering with business stakeholders, defining requirements, designing lakehouse architectures, and building production-grade pipelines using technologies such as Databricks, Delta Lake, Delta Live Tables, Unity Catalog, Apache Spark, Kafka, AWS services, and SQL.
  • Success in this role requires deep technical expertise, strong problem-solving skills, and the ability to communicate effectively across technical and business teams to deliver measurable, data-driven outcomes.


Senior Data Engineers (Lead/Senior/Intermediate) are responsible for designing, building, and operating scalable data pipelines and lakehouse architectures using Databricks, Apache Spark (Spark SQL/PySpark), Delta Lake, Delta Live Tables, Unity Catalog, Kafka, AWS (S3, IAM), and SQL. Core duties include defining data requirements; ingesting, integrating, and processing large structured and unstructured datasets; developing standardized and ad hoc datasets; implementing data quality and governance frameworks (access control, retention, sensitive data handling); optimizing batch and streaming pipelines; and collaborating with data science, BI, and business stakeholders to deliver analytics and machine learning solutions within Agile environments. Candidates must hold a bachelor’s degree in a related field and have 5+ years of data engineering experience, including 4+ years with Spark, 3+ years with Kafka and AWS, and proven expertise in ETL/ELT pipelines, incremental processing, and production-grade systems in lakehouse environments. Required skills include advanced SQL, data modeling, performance optimization, and operational best practices (testing, observability). Preferred qualifications include experience with Snowflake, NoSQL databases (e.g., Cassandra), and enterprise messaging systems (e.g., TIBCO EMS, IBM MQ).

