Infrastructure Engineer (On-Premise)

jan • Taiwan

Remote

This job is no longer active and is no longer accepting applications.

Job Description

Homebrew is an AI R&D Lab. We train our own models and are the creators and maintainers of popular open-source AI tools:

Jan: Desktop Copilot (>1 million downloads)
Cortex: Local, open-source alternative to OpenAI Platform
Menlo: GPU Training Cluster

We are a fully remote company. In the long term, our objective is to train useful, safe AI that helps improve humanity.

Homebrew is looking for an Infrastructure Engineer to help run our GPU Training Cluster, internal GPU Cloud. Please note that this is an On-Premise role, as we build our own infrastructure.

Responsibilities

Design and maintain the organization's infrastructure, including compute and storage nodes, high-bandwidth networking infrastructure, and security and monitoring infrastructure
Design and maintain software for infrastructure management and orchestration (e.g. OpenStack, Kubeflow, Proxmox)
Participate in incident response and resolution to ensure high availability and performance
Develop and maintain solutions for day-to-day operational administration, system/data backup, disaster recovery, and security/performance monitoring
Collaborate with the Engineering team to implement DevSecOps practices (e.g. infrastructure as code, CI/CD)

Requirements

Familiarity with on-premise infrastructure (e.g. racks with power, storage, compute, and network nodes)
Ability to perform basic to intermediate hardware troubleshooting, servicing, and repairs
[Plus] Experience with Slurm, Kubeflow or alternative cluster orchestration tools
[Plus] Experience with OpenStack, VMware, Proxmox or alternative cloud orchestrator tools
[Plus] Experience with designing GPU Clusters or HPC systems (inter-cluster networking)
[Plus] Familiarity with software-defined storage technologies (Ceph, ZFS, NFS, etc.)

Benefits

We pay an “all-in” salary, from which you cover your own insurance and medical costs
14 days of leave (and unlimited sick days)
Annual equipment budget (once the 2-month probation has been completed)

Job Overview

Posted Date: Jul 12, 2024

Experience Level: Full-time

Location: Taiwan

Category: Networking

Company: jan
