Principal Operations Engineer Hardware — Data Center Operations

Become a hardware operations expert for large-scale AI data centers

You will ensure the reliability of GPU, server, and supporting infrastructure hardware for AI projects. You will serve as the bridge between operations, engineering, and customer teams.

Why Is It Interesting?

Work on large-scale AI infrastructure projects with a focus on speed and scale.

Required Skills

hardware operations, data center management, ai infrastructure, system reliability, technical leadership

Indonesia Context

Working Hours Overlap: Flexible (set your own working hours)

Keywords

principal operations engineer, hardware, data center, ai, hyperscale, full-time, remote

Original description from RemoteOK

About Fluidstack

We exist to make humanity more free. For most of human history, you farmed or you starved. Technology gave people more time for the things they wanted to do, instead of things they had to do. Powerful AI will be the biggest lever for human choice we've ever built - but only if models are aligned with what humanity actually wants. There are groups building AI who don't share these goals. Whoever deploys frontier compute infrastructure fastest will decide whether AI expands human freedom or shrinks it.

We're singularly focused on delivering 10 to 100s of GWs of compute faster than anyone else, rethinking every layer of the stack. We acquire power, design and build data centers, and operate them - with teams spanning hardware and software. Speed and scale are our key differentiators. Come be a part of building civilization-scale infrastructure for AI.

We hire people who care deeply about this problem space. If that is you, please apply!

About the Role

We are seeking a Principal Operations Engineer, Hardware to serve as the most senior technical authority for the operational hardware fleet across our hyperscale AI data center portfolio. AI infrastructure lives and dies on the reliability of the compute itself - this role exists to ensure that the GPU systems, servers, and supporting hardware we deploy at scale are operated, maintained, and continuously improved at the standard the workload demands.

You will operate as the technical arm of senior operations leadership in the field - leading site assessments and operational audits, driving the technical readiness of teams ahead of site activation, reviewing hardware platforms and integration designs from an operational lens, and feeding operational learnings back into the hardware engineering, deployment, and supply chain organizations as we shift toward a productized, repeatable build model.

You will be a force multiplier across our site hardware leads, deployment teams, and reliability engineers, and the connective tissue between hardware operations, hardware engineering, network, facilities, and customer-facing teams.

The ideal candidate has spent a career operating hardware at scale - in hyperscale data centers, large HPC environments, or comparable 24/7 infrastructure - and is equally comfortable diagnosing a stubborn boot failure on the floor, leading a fleet-wide root cause investigation, and pushing back on a vendor on a flawed RMA process. Formal engineering credentials are valued but not required - practical depth, judgment under pressure, the ability to teach, and the discipline to keep critical infrastructure running through change are what define this role.

Responsibilities

10+ years of hands-on experience operating mission-critical hardware infrastructure, with at least 5 years as the senior technical voice on a site, campus, or fleet.
Data center operations experience strongly preferred; hyperscale, large HPC, cloud, or other mission-critical compute infrastructure experience considered.
Deep working command of GPU systems, server platforms, storage infrastructure, firmware lifecycle management, and hardware diagnostics - earned in the field, not from a textbook.
Demonstrated ability to author, approve, and execute high-risk MOPs and change records in live production environments.
A track record of leading root cause analysis on significant hardware events and driving corrective actions to closure.
A track record of holding OEMs, ODMs, service vendors, and deployment partners accountable - you know how to enforce a standard without burning the relationship.
Strong written communication: operational health assessments, RCAs, procedure reviews, and design review feedback are second nature.
Comfort operating as the senior technical voice across operations, hardware engineering, network, facilities, supply chain, and customer-facing teams.
Willingness to travel extensively across the fleet (50-75%).

Preferred Qualifications

Bachelor's degree in Computer En

Company

Fluidstack

Source

RemoteOK

Salary

$XX,XXX

Job Type

full time

Location

Worldwide Remote · Remote

Category

Design

Level

lead

Posted

9 Jun 2026
