We are seeking a highly skilled Gen AI Inferencing Software Engineer to design, build, and operate reusable toolkits supporting Gen AI RAG capabilities. This role focuses on developing scalable inferencing frameworks, AI platforms, and automation systems that power enterprise-grade AI/ML solutions. The ideal candidate will have strong experience in Python-based large-scale systems and hands-on expertise in the Gen AI lifecycle, RAG pipelines, and inference frameworks.
Job Description
Location: Addison, TX / Charlotte, NC (Onsite)
Employment: W2 Only
Visa: USC / H4 EAD only (please do not apply with any other status)
Relocation: Same state only
We are looking for a highly skilled Software Engineer - Gen AI Inferencing to design, build, and operate reusable toolkits supporting GenAI RAG capabilities. This role focuses on developing scalable inferencing frameworks, AI platforms, and automation systems that power enterprise-grade AI/ML solutions.
If you have strong experience in Python-based large-scale systems and hands-on expertise in the GenAI lifecycle, RAG pipelines, and inference frameworks, we want to hear from you!
Key Responsibilities:
- Design, develop, and maintain reusable GenAI inferencing toolkits and RAG frameworks
- Build scalable AI/ML solutions meeting functional, non-functional, and compliance standards
- Deploy and optimize models using vLLM or Triton Inference Server in containerized environments
- Automate CI/CD pipelines and release workflows
- Develop automated testing frameworks (integration, regression, performance)
- Run proofs of concept (POCs) and risk-mitigation spikes
- Collaborate with product teams, data scientists, and stakeholders
- Mentor engineers and promote DevOps and automation best practices
Required Qualifications:
- 5+ years of OOP development experience (Python / Scala / Java)
- Strong hands-on experience with GenAI / AI-ML lifecycle management
- Experience building RAG pipelines (chunking, embeddings, retrieval, reranking, summarization)
- Model deployment experience with vLLM / Triton Inference Server
- Experience with containerization and CI/CD automation
- Experience building API-based applications (FastAPI, JWT, API Gateway)
- Hands-on DevOps experience (Git, Jenkins, SonarQube, pytest, Artifactory, Ansible)
- Experience working in large collaborative multi-repo environments
Desired Qualifications:
- Experience building GenAI inferencing platforms using open-source toolsets
- AI Gateway, observability, and policy store implementation
- Strong research mindset with ability to prototype innovative solutions
- Experience driving quality, automation, and experimentation culture
Key Skills:
Application Development | GenAI | RAG | Python | MLOps | DevOps | Architecture | Automation | CI/CD | Containerization | MongoDB | Redis | React/Angular | API Development | Test Engineering | Collaboration