Senior Machine Learning Engineer (Synthetic Data Generation)

AI Summary

Rockfish Data is seeking an experienced Machine Learning Engineer to build and deploy end-to-end ML solutions on their outcome-centric synthetic data platform. This role involves designing ML pipelines, integrating systems, and collaborating with customers to drive innovation in AI model scaling. Ideal candidates have strong production ML experience, LLM/NLP integration skills, and expertise in PyTorch/TensorFlow.

Key Highlights
Design, build, and maintain end-to-end ML pipelines from data ingestion to deployment.
Integrate ML systems with backend infrastructure and ensure seamless production deployment.
Collaborate with customers to understand requirements and iterate on ML solutions.
Shape the ML infrastructure of a growing synthetic data generation platform.
Technical Skills Required
Python, PyTorch, TensorFlow, Hugging Face, OpenAI, AWS, GCP, Azure, LLMs, NLP, GANs, VAEs, Diffusion Models, Transformers, Federated Learning, Differential Privacy, GenAI
Benefits & Perks
Competitive compensation (adjusted for location)
Visa sponsorship available (H-1B transfers)
Small team environment with direct impact

Job Description


Location: Pittsburgh, PA or San Francisco, CA (Hybrid or Onsite)


About Rockfish 

Rockfish Data is the industry’s first outcome-centric data generation platform helping organizations overcome data bottlenecks — such as sparsity, privacy, and accessibility — through high-fidelity synthetic data. We enable teams to safely innovate and scale AI models when real-world data is limited or constrained.


About the Role 

We're seeking an experienced Machine Learning Engineer to join our growing team. You'll work on building and deploying end-to-end machine learning solutions that directly impact our customers. This role offers the opportunity to work closely with our technical leadership and shape the ML infrastructure of our platform.


What You’ll Do

  • Design, build, and maintain end-to-end machine learning pipelines from data ingestion through deployment
  • Handle data preprocessing, model training, tuning, and production deployment
  • Integrate ML systems with backend infrastructure and ensure seamless deployment
  • Work directly with customers to understand requirements and iterate on solutions
  • Make informed trade-offs between model performance, scalability, and business needs


Required Qualifications

  • 4+ years building and deploying ML systems in production
  • 2+ years with LLMs/NLP model integration (Hugging Face, OpenAI, custom stacks)
  • Deep experience with PyTorch or TensorFlow in distributed environments
  • Strong statistical, probabilistic, and generative modeling foundation
  • Experience with AWS, GCP, or Azure, and MLOps best practices
  • Strong Python programming skills
  • Hands-on experience with traditional machine learning or end-to-end ML pipelines
  • Demonstrated ability to take ML projects from conception through production deployment
  • Understanding of how ML systems integrate with broader software architecture
  • Experience with model development, training, and tuning in production environments


Preferred Qualifications

  • Ph.D. or equivalent experience in CS, EE, Math, or a related field
  • Experience with GANs, VAEs, diffusion models, and transformers
  • Background in privacy-preserving ML (federated learning, differential privacy)
  • Track record of technical leadership in startups or high-growth teams
  • Contributions or publications in ML research
  • Customer-facing experience or experience working directly with stakeholders
  • Background in synthetic data generation or GenAI (optional)
  • Previous experience at startups or in fast-paced environments


What We Offer 

  • Competitive compensation (adjusted for location)
  • Visa sponsorship available (H-1B transfers for the right candidate)
  • Small team environment with direct impact


📧 Apply at careers@rockfish.ai or debjani@rockfish.ai

