
Systems Engineer, HPC (US & Canada)

Operate and manage large-scale Linux infrastructure for AI platforms

As a Systems Engineer, you will manage and maintain large-scale Linux environments, including HPC clusters and cloud infrastructure. Your duties include monitoring system health, troubleshooting incidents, and ensuring high availability. You will also help scale the infrastructure to thousands of nodes and optimize resource utilization. In addition, you will automate operational tasks using tools such as Python, Ba

Why It's Interesting

Join a team that is building the infrastructure to support large-scale, petabyte-scale AI systems.

Key Responsibilities

  • Monitor system health, troubleshoot incidents, and ensure high availability
  • Support production and research workloads across multiple environments
  • Help scale clusters toward hundreds to thousands of nodes
  • Automate operational tasks using tools such as Python, Bash, Ansible, or Terraform
  • Contribute to system design and architecture decisions

Requirements

  • Strong experience in Linux systems administration
  • Experience working in large-scale environments: HPC clusters or cloud infrastructure
  • Experience with job schedulers such as Slurm
  • Strong troubleshooting skills across systems, hardware, and networks

Required Skills

linux · hpc · cloud · automation · scalability · performance optimization · troubleshooting

Indonesia Context

Working Hours Overlap:
Minimal overlap: working hours are opposite

Keywords

systems engineer · linux administration · hpc · cloud infrastructure · automation · scalability · performance optimization · remote · full-time · ai

Original description from Lever Postings

About Mistral

At Mistral AI, we build high-performance, open, and efficient AI systems designed to power the next generation of applications. Our infrastructure combines large-scale distributed systems, cloud platforms, and HPC environments to support cutting-edge research and production workloads.

We are a collaborative, low-ego, and highly technical team, operating across Europe, the US, and beyond. As we scale rapidly, we are building the foundational infrastructure to support thousands of nodes and petabyte-scale systems.

Join us to be part of a pioneering company shaping the future of AI. Together, we can make a meaningful impact. See more about our culture on https://mistral.ai/careers.

About the Role

We are looking for Systems Engineers / System Administrators to help design, operate, and scale the infrastructure behind Mistral’s AI platforms. This is a hands-on, hybrid role combining:

  • Systems administration (operating and troubleshooting large-scale Linux environments)
  • Systems engineering (automation, scalability, and performance improvements)

You’ll work closely with infrastructure, HPC, and research teams to ensure our clusters and platforms run reliably at scale.

What You’ll Work On

Core Systems Operations

  • Operate and maintain large-scale Linux environments (bare metal, clusters, cloud)
  • Monitor system health, troubleshoot incidents, and ensure high availability
  • Support production and research workloads across multiple environments

Scaling Infrastructure

  • Help scale clusters toward hundreds to thousands of nodes
  • Work on systems handling petabyte-scale storage
  • Improve performance, reliability, and resource utilisation

Automation & Engineering

  • Automate operational tasks using tools like Python, Bash, Ansible, or Terraform
  • Improve deployment, provisioning, and system lifecycle management
  • Contribute to system design and architecture decisions

Cross-Functional Collaboration

  • Work closely with: HPC / infrastructure teams, Platform / DevOps engineers, Research teams
  • Act as a bridge between users and infrastructure

What We’re Looking For

Must-have

  • Strong Linux systems administration experience (core requirement)
  • Experience working in large-scale environments: HPC clusters or cloud infrastructure
  • Experience with Job schedulers (e.g. Slurm)
  • Solid troubleshooting skills across systems, hardware, and networks

Nice-to-have (any of these)

We are not expecting everything; strong depth in one area is valuable.

  • Containers / orchestration (e.g. Kubernetes)
  • Storage systems (e.g. Ceph, Lustre, NFS)
  • Networking fundamentals (Ethernet; InfiniBand is a plus)
  • Infrastructure as Code / automation tooling
  • GPU or AI/ML experience

Profile We Value

  • Pragmatic problem solver who can operate in fast-scaling environments
  • Comfortable working across multiple domains (“Swiss army knife” mindset)
  • Able to go deep in one area while learning others
  • Low-ego, collaborative, and hands-on

Why Join Mistral?

  • Impact: Play a pivotal role in scaling Mistral’s cutting-edge AI infrastructure.
  • Growth: Opportunity to shape data centre operations from the ground up in a high-growth startup environment.
  • Collaboration: Work with a talented, cross-functional team passionate about AI and technology.
  • Flexibility: Competitive compensation, benefits, and the chance to contribute to revolutionary projects.


Hiring in North America only

This employer appears to hire only in the region above. Make sure you are eligible to be hired from Indonesia before applying.

Company
Mistral AI
Source
Lever Postings
Job Type
full time
Location
Worldwide Remote · Remote
Category
Level
mid
Posted
17 Jun 2026
