Job Description
This role is for one of our clients
Compensation: $80-$100 per hour
We are seeking GPU kernel optimization experts to contribute to a project with a leading AI lab. This opportunity is designed for freelancers with strong C++ skills, practical GPU programming experience, and the ability to improve kernel performance using profiler-guided analysis. You'll help evaluate, optimize, and reason about GPU kernels across modern hardware environments. This is a contract-based opportunity for specialists who enjoy squeezing performance out of modern GPU architectures.
Requirements
Key Responsibilities
- Analyze and optimize GPU kernels for performance, efficiency, and hardware utilization
- Use profiler metrics such as L2 cache hit rate, L2 throughput, occupancy, and related signals to guide kernel improvements
- Review GPU kernel implementations and identify bottlenecks without requiring extensive background in the underlying algorithms
- Write, modify, and reason about C++17, Python, and GPU programming code
- Apply CUDA, HIP, shader programming, or related kernel programming expertise to improve performance outcomes
- Document optimization decisions clearly, including when specific profiler metrics are or are not useful
Technical Skills Required
- Available to work at least 20 hours per week
- Fluent in core C++ features through C++17
- Working knowledge of Python and Git
- Fluent in at least one GPU programming model, such as CUDA, HIP, Slang, HLSL, GLSL, or a related kernel programming language
- At least 1 year of professional or graduate-level research experience working with GPUs
- Strong understanding of GPU profiler performance metrics and how to use them to optimize kernels
- Ability to optimize GPU kernels without needing deep prior context on every algorithm
- Experience with CUDA, HIP, CUDA C++ Core Libraries, inline PTX assembly, or tensor core-level optimization is a plus
- Experience optimizing kernels for NVIDIA Blackwell hardware is a plus
- Familiarity with Nsight Compute is a plus
- Prior experience with GPU hardware organizations such as NVIDIA, AMD, or Qualcomm is a plus
- Open-source contributions related to GPU kernel optimization are a plus
Application Process
- Submit your resume or relevant technical background to get started
- Qualified applicants may be asked to complete a brief technical assessment or submit additional information
Contract and Payment Terms
- You will be engaged as an independent contractor
- This is a fully remote role that can be completed on your own schedule
- Projects can be extended, shortened, or concluded early depending on needs and performance
- Your work will not involve access to confidential or proprietary information from any employer, client, or institution
- Payments are made weekly via Stripe or Wise, based on services rendered
- Please note: We are unable to support H-1B or STEM OPT candidates at this time