AI Infrastructure Engineer (On-Premise)

Remote
Job Description

Employment Type: Full-time

Location: Remote

Experience: 1–5 Years

About the Role

AppXcess Technologies is seeking a skilled AI Infrastructure Engineer focused on on-premise infrastructure. The role is responsible for building, managing, and optimizing the infrastructure that supports AI workloads, including GPU-based systems and enterprise-grade server environments.

You will work across both AI systems and core infrastructure layers, ensuring high performance, reliability, and scalability of mission-critical platforms.

Key Responsibilities

AI Infrastructure & Model Operations

  • Deploy and operate AI/LLM workloads on GPU-based systems (NVIDIA environments)
  • Run and optimize inference servers such as vLLM, Triton, or similar frameworks
  • Monitor GPU utilization, memory, and system performance for efficient AI execution

System Design & Backend Engineering

  • Design and develop production-grade APIs (FastAPI / gRPC / REST) for AI services
  • Architect asynchronous systems using queues, workers, and distributed pipelines
  • Build scalable backend systems for high-concurrency AI workloads

Performance, Scalability & Reliability

  • Plan infrastructure capacity around GPU usage, latency, and throughput targets
  • Implement batching, rate limiting, and workload optimization strategies
  • Ensure system resilience with fault tolerance and graceful degradation

On-Premise Infrastructure Management

  • Manage and maintain on-premise infrastructure including servers, storage, and networking systems
  • Configure and administer physical and virtual servers across Linux and Windows environments
  • Implement and support virtualization platforms such as VMware or Hyper-V
  • Manage networking components including switches, routers, firewalls, and load balancers
  • Monitor infrastructure performance, availability, and system health
  • Perform system upgrades, patching, backups, and disaster recovery processes
  • Troubleshoot hardware, network, and system-related issues effectively
  • Maintain infrastructure security and access controls, and ensure compliance with best practices
  • Maintain clear documentation of configurations, processes, and infrastructure architecture
  • Support capacity planning and infrastructure scaling requirements

Observability & Operations

  • Build monitoring systems for latency, throughput, GPU usage, and system health
  • Diagnose performance bottlenecks and infrastructure issues
  • Ensure stable operations under high-load and production conditions

Collaboration & Productionization

  • Work closely with AI/ML teams to deploy and scale models into production
  • Abstract infrastructure complexity for application and product teams
  • Translate experimental AI systems into reliable production deployments

Requirements

Core Requirements

  • 1–5 years of experience in infrastructure engineering, system administration, or backend systems
  • Strong knowledge of Linux and Windows server environments
  • Hands-on experience working with GPU-based systems or compute-intensive workloads
  • Strong programming skills in Python (Golang is a plus)
  • Understanding of distributed systems and asynchronous processing
  • Experience with Docker or containerized environments
  • Ability to analyze and optimize system performance (latency, throughput, cost)

Infrastructure & Networking

  • Experience with virtualization technologies such as VMware or Hyper-V
  • Solid understanding of networking concepts (TCP/IP, DNS, VLANs, VPNs, firewalls)
  • Experience with storage systems, backup strategies, and disaster recovery
  • Familiarity with enterprise hardware such as rack servers and storage systems

General Skills

  • Strong troubleshooting and analytical thinking
  • Ability to work independently in a remote setup
  • Good communication and collaboration skills

Strong Practical Experience (Preferred)

  • Deployed AI/ML models in production environments
  • Debugged GPU-related issues such as memory constraints or latency bottlenecks
  • Designed systems for high concurrency and compute-heavy workloads
  • Redesigned synchronous systems into asynchronous architectures

Additional Considerations

  • Experience working in data center environments
  • Basic scripting knowledge (Shell, Python, or PowerShell)
  • Familiarity with monitoring and infrastructure management tools

What We're Looking For

  • Engineers who understand infrastructure deeply, not just surface-level tools
  • Strong problem-solvers who can optimize systems under real-world constraints
  • Individuals comfortable working across AI systems and core infrastructure layers
  • Hands-on professionals with real production experience

Who May Not Be a Fit

  • Candidates with only API-level AI exposure and no infrastructure experience
  • Engineers focused only on frontend or prompt engineering
  • Candidates without experience handling compute-heavy workloads or infrastructure systems

What We Offer

  • Work on enterprise-grade infrastructure and mission-critical systems
  • Exposure to real-world AI and infrastructure environments
  • Fully remote work setup with flexible collaboration
  • Growth opportunities in AI infrastructure and system engineering
  • Health insurance coverage
  • Opportunities for international travel based on project needs
  • Supportive and high-performance work culture
