Jobgether is seeking a Senior Networking Solution Test Engineer to ensure the reliability and performance of complex AI clusters. This role requires strong debugging intuition and the ability to reproduce and analyze real-world customer scenarios. The ideal candidate will have experience in Linux-based networking, system testing, and complex debugging environments.
Job Description
This position is posted by Jobgether on behalf of a partner company. We are currently looking for a Senior Networking Solution Test Engineer - AI Cluster Debugging in Switzerland.
This role sits at the forefront of large-scale AI infrastructure validation, where networking, systems engineering, and artificial intelligence workloads converge. You will be responsible for ensuring the reliability and performance of complex AI clusters built on high-speed interconnect technologies such as NVLink, Ethernet, and InfiniBand. Working in a highly technical and collaborative environment, you will investigate deep system-level issues spanning hardware, drivers, networking stacks, and AI frameworks. The position requires strong debugging intuition and the ability to reproduce and analyze real-world customer scenarios in advanced test environments. You will contribute directly to the stability and scalability of next-generation AI training and inference systems used at massive scale. This is a hands-on engineering role where your analysis and findings directly shape product quality and system performance.
Accountabilities
- Design and review test strategies and product requirements for NVLink, Ethernet, and InfiniBand-based AI cluster systems.
- Build and maintain realistic, large-scale test environments replicating customer-like AI infrastructure, including heterogeneous hardware and software stacks.
- Lead end-to-end system debugging across hardware, firmware, networking, and AI software layers to identify and resolve root causes.
- Analyze logs, inspect source code, and validate fixes across components such as NICs, DPUs, switches, and AI communication libraries.
- Collaborate closely with development teams to debug and optimize protocols such as NCCL, RoCE, and RDMA.
- Define, design, and guide automation efforts for robust testing frameworks producing actionable logs, metrics, and traces.
- Execute regression, performance, functional, and scalability testing, and deliver clear, data-driven technical reports.
- Profile and benchmark AI training and inference workloads, correlating application behavior with system and network performance metrics.
Requirements
- Bachelor’s or Master’s degree in Computer Science, Electrical Engineering, or equivalent hands-on experience in systems/network engineering.
- 8+ years of experience in Linux-based networking, system testing, and complex debugging environments.
- Strong expertise in Linux networking tools and debugging utilities (e.g., tcpdump, ethtool, iproute2, perf).
- Proven experience in production-grade troubleshooting, hypothesis-driven debugging, and root cause analysis under pressure.
- Solid understanding of NIC architecture, offloads, queue management, and driver/firmware interactions.
- Deep knowledge of AI networking technologies such as NCCL, RoCE, and RDMA.
- Ability to read, understand, and debug source code in C/C++, Python, or similar languages.
- Strong scripting and automation skills using Bash, Python, and/or Ansible.
- Experience working in fast-evolving technical environments with strong adaptability and learning ability.
- Excellent analytical, communication, and collaboration skills, with a strong ownership mindset.
Benefits
- Competitive compensation aligned with senior-level expertise and Swiss market standards.
- Opportunity to work on cutting-edge AI cluster and high-performance networking technologies.
- Exposure to large-scale systems powering advanced AI training and inference workloads.
- Highly technical, research-driven engineering environment with strong innovation focus.
- Collaborative international team working on next-generation infrastructure challenges.
- Access to complex, large-scale test environments and advanced debugging tools.
- Inclusive workplace culture supporting diversity, equity, and professional growth.
- Relocation support and accommodation of accessibility needs, where applicable.
Why Apply Through Jobgether?
We use an AI-powered matching process to ensure your application is reviewed quickly, objectively, and fairly against the role's core requirements. Our system identifies the top-fitting candidates, and this shortlist is then shared directly with the hiring company. The final decision and next steps (interviews, assessments) are managed by their internal team.
We appreciate your interest and wish you the best!
Data Privacy Notice: By submitting your application, you acknowledge that Jobgether will process your personal data to evaluate your candidacy and share relevant information with the hiring employer. This processing is based on legitimate interest and pre-contractual measures under applicable data protection laws (including GDPR). You may exercise your rights (access, rectification, erasure, objection) at any time.
We may use artificial intelligence (AI) tools to support parts of the hiring process, such as reviewing applications, analyzing resumes, or assessing responses. These tools assist our recruitment team but do not replace human judgment. Final hiring decisions are ultimately made by humans. If you would like more information about how your data is processed, please contact us.