Job Description
Dice is the leading career destination for tech experts at every stage of their careers. Our client, TOPSYSIT, is seeking the following. Apply via Dice today!
Role: AI Data Engineer
Client: Ally Bank
Location: Detroit, MI (3 days onsite, 2 days remote)
Duration: Long Term
Job Description:
We are seeking an experienced AI Data Engineer to join our team. The ideal candidate will design, build, and maintain scalable data infrastructure and pipelines that power AI, machine learning (ML), agentic AI, and generative AI (GenAI) initiatives. This position requires deep expertise in data engineering and a strong understanding of the unique data needs of AI models.
Key Responsibilities:
- Build AI-ready data pipelines: Design, construct, and optimize ETL/ELT pipelines tailored for AI and ML models.
- Architect data solutions: Develop and manage data lakes, data warehouses, and vector databases to support AI workloads.
- Ensure data quality & governance: Implement data validation, security, and governance policies to maintain integrity and compliance.
- Support AI model lifecycle: Collaborate with data scientists and ML engineers to prepare and manage large-scale datasets for training and deployment.
- Manage real-time data: Develop streaming pipelines using tools like Apache Kafka for real-time AI applications (a minimal ingestion sketch follows this list).
- Optimize cloud infrastructure: Build and scale AI data solutions on AWS cloud platforms.
- Deploy AI models: Automate model training and deployment, serving models through APIs and microservices.
- Monitor & troubleshoot: Implement data observability to detect data drift, ensure pipeline health, and resolve data issues (a drift-check sketch also follows this list).
- AI-assisted development: Leverage tools like Copilot in Microsoft Fabric notebooks to generate, explain, and optimize data workflows.
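
To make the real-time responsibility concrete, here is a minimal Python ingestion sketch using the confluent-kafka client. It assumes a broker at localhost:9092 and a hypothetical "transactions" topic; the broker address, group id, topic, and field names are illustrative assumptions, not details of the client's actual stack.

# Minimal real-time ingestion sketch (confluent-kafka).
# Assumes a local broker and a hypothetical "transactions" topic.
import json
from confluent_kafka import Consumer

consumer = Consumer({
    "bootstrap.servers": "localhost:9092",  # assumed broker address
    "group.id": "ai-feature-ingest",        # hypothetical consumer group
    "auto.offset.reset": "earliest",
})
consumer.subscribe(["transactions"])        # hypothetical topic name

try:
    while True:
        msg = consumer.poll(1.0)            # wait up to 1s for a record
        if msg is None:
            continue
        if msg.error():
            print(f"consumer error: {msg.error()}")
            continue
        event = json.loads(msg.value())
        # Placeholder for feature computation / handoff to a model endpoint.
        print(f"received event for entity {event.get('entity_id')}")
finally:
    consumer.close()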
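For the observability item, a sketch of one common data-drift check: comparing a feature's training-time distribution against recent production values with a two-sample Kolmogorov-Smirnov test from scipy. The synthetic data and the 5% significance threshold are illustrative assumptions, not a prescribed monitoring policy.

# Basic drift check: KS test between baseline and live feature values.
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(1)
train_values = rng.normal(loc=0.0, scale=1.0, size=5000)  # stand-in baseline
live_values = rng.normal(loc=0.3, scale=1.0, size=1000)   # shifted "live" data

result = ks_2samp(train_values, live_values)
if result.pvalue < 0.05:                                  # assumed threshold
    print(f"drift suspected (KS={result.statistic:.3f}, p={result.pvalue:.4f})")
else:
    print("no significant drift detected")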
Qualifications:
- Education: Bachelor's or Master's degree in Computer Science, Data Science, Engineering, or a related field.
- Experience: Proven experience in data engineering supporting AI/ML projects.
- Programming: Strong in Python and SQL (familiarity with Java/Scala a plus).
- ML Frameworks: Experience with TensorFlow, PyTorch, Scikit-learn, and LLM tools (e.g., LangChain, LlamaIndex).
- Big Data Tools: Hands-on experience with Apache Spark, Hadoop, etc.
- Cloud Platforms: Proficiency in AWS (preferred), Azure, or Google Cloud Platform with AI data services.
- Databases: Skilled in SQL, NoSQL, and vector databases for GenAI use cases (see the similarity-search sketch after this list).
- DevOps/MLOps: Experience with CI/CD, Docker, MLflow, and related automation tools.
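
To ground the vector-database requirement, here is a toy Python sketch of the core operation such systems provide for GenAI retrieval: ranking stored embeddings by cosine similarity to a query embedding. The random vectors stand in for a real embedding model; production systems (e.g., pgvector, OpenSearch) add indexing and persistence on top of this idea.

# Toy nearest-neighbour lookup: the operation at the heart of a vector DB.
import numpy as np

rng = np.random.default_rng(0)
doc_vecs = rng.normal(size=(1000, 384))   # 1,000 fake 384-dim embeddings
query_vec = rng.normal(size=384)          # fake query embedding

def cosine_top_k(query, docs, k=5):
    """Return indices of the k docs most similar to the query."""
    docs_n = docs / np.linalg.norm(docs, axis=1, keepdims=True)
    query_n = query / np.linalg.norm(query)
    scores = docs_n @ query_n             # cosine similarity per document
    return np.argsort(scores)[::-1][:k]

print(cosine_top_k(query_vec, doc_vecs))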
Additional Details:
- Mode: Hybrid (3 days onsite in Detroit, MI)
- Interview: Virtual (no in-person required initially)
- Relocation: Required after confirmation