Senior Software Engineer - Web Data Team
Build large-scale crawling and data extraction systems for ZoomInfo
You will design and implement components of large-scale web crawling and extraction systems that process billions of pages. You will work with cloud infrastructure on GCP and AWS, primarily on GKE. Your work includes writing production code in Java and Python, building and operating ETL/ELT pipelines, and improving system observability and reliability.
Why It's Interesting
Work on a high-impact team focused on building large-scale crawling and data extraction infrastructure.
Key Responsibilities
- Design and implement components of scalable, fault-tolerant web crawling and extraction pipelines
- Write production code in Java and Python
- Build and operate ETL/ELT pipelines for large-scale data extraction and transformation
- Work with cloud infrastructure on GCP and AWS, primarily on GKE
- Improve observability, reliability, and operational excellence across the systems you contribute to
- Partner with product and data science teams to deliver impactful solutions
Requirements
- Experience writing production code in Java and/or Python
- Experience building and operating ETL/ELT pipelines
- Experience with cloud infrastructure on GCP (preferred) or AWS
- Ability to design scalable, fault-tolerant systems
Indonesia Context
- Working-hours overlap: flexible; set your own hours
Original description from Greenhouse Boards
<div class="content-intro"><p>ZoomInfo is where careers accelerate. We move fast, think boldly, and empower you to do the best work of your life. You’ll be surrounded by teammates who care deeply, challenge each other, and celebrate wins. With tools that amplify your impact and a culture that backs your ambition, you won’t just contribute. You’ll make things happen–fast.</p></div><h3><strong>The Opportunity</strong></h3> <p>We're looking for a Senior Software Engineer to join our Web Data team and help build the next generation of ZoomInfo's web crawling and data extraction infrastructure.</p> <p>This is a hands-on engineering role with high impact. You'll work alongside experienced engineers building large-scale crawling and extraction systems that process billions of pages. Your day-to-day will involve solving real distributed systems problems, writing production code, and shipping features that directly impact data quality across the platform.</p> <p>You'll partner with a dedicated people manager who handles HR and administrative responsibilities, a product manager who connects business needs with technical work, and a senior manager who removes roadblocks and supports career growth. 
Your focus is on engineering execution, technical excellence, and collaboration.</p> <h3><strong>What You'll Do</strong></h3> <p>As a Senior Software Engineer, you'll contribute to enterprise-scale crawling and extraction platforms that process massive volumes of web data.</p> <p>You will:</p> <ul> <li>Design and implement components of scalable, fault-tolerant web crawling and extraction pipelines</li> <li>Write clean, production-grade code in Java and Python</li> <li>Build and operate ETL/ELT pipelines for large-scale data extraction and transformation</li> <li>Work with cloud infrastructure on GCP and AWS, primarily on GKE</li> <li>Improve observability, reliability, and operational excellence across the systems you contribute to</li> <li>Partner with product and data science teams to deliver impactful solutions</li> <li>Contribute to code reviews, documentation, and knowledge sharing across the team</li> <li>Stay current with evolving web technologies, anti-crawling mechanisms, and AI-powered extraction approaches</li> </ul> <h3><strong>Must-Have Qualifications</strong></h3> <p><strong>We're prioritizing strong software engineering fundamentals over deep crawling-specific experience. 
The right candidate is a great engineer first; the domain can be learned on the team.</strong></p> <h3><strong>Software Engineering Fundamentals</strong></h3> <ul> <li>5+ years of professional software engineering experience building production systems</li> <li>Strong CS fundamentals: algorithms, data structures, concurrency, distributed systems</li> <li>Proficiency in Java and/or Python</li> <li>Track record of owning features end-to-end from design through deployment and operation</li> <li>Comfortable making sound architectural decisions at the component level</li> </ul> <h3><strong>Data Engineering</strong></h3> <ul> <li>Hands-on experience with cloud data warehouses such as BigQuery or Snowflake</li> <li>Experience designing and operating large-scale ETL/ELT pipelines</li> <li>Experience with orchestration tools such as Apache Airflow</li> <li>Experience with streaming or event-driven systems such as Apache Kafka</li> </ul> <h3><strong>Cloud and Infrastructure</strong></h3> <ul> <li>Production experience on GCP (preferred) or AWS; multi-cloud exposure is a plus</li> <li>Hands-on experience with Kubernetes (GKE/EKS) for distributed workloads</li> <li>Familiarity with infrastructure-as-code tooling such as Terraform</li> </ul> <h3><strong>Background and Mindset</strong></h3> <ul> <li>Strong communicator who can explain technical decisions clearly</li> <li>Comfortable operating in ambiguity and iterating quickly</li> <li>Bias toward action and pragmatic problem solving</li> <li>Self-starter who thrives in fast-paced, evolving environments</li> </ul> <h3><strong>Nice to Have</strong></h3> <ul> <li>Experience with web crawling at scale (Scrapy or similar frameworks)</li> <li>Familiarity with proxy infrastructure, rotation strategies, or anti-bot evasion techniques</li> <li>Experience in extracting structured and unstructured web data from diverse site architectures</li> <li>Knowledge of SERP (Search Engine Results Page) extraction</li> <li>Comfort with AI/LLM-based 
extraction approaches, applying language models to HTML at scale</li> <li>Experience working in a B2B data company or data-as-a-product environment</li> </ul> <h3><strong>Core Technical Stack</strong></h3> <ul> <li>Java and Python</li> <li>Apache Kafka</li> <li>GCP (BigQuery, GKE, Vertex AI)</li> <li>Snowflake and Starburst/Trino</li> <li>Terraform</li> <li>Scrapy and web scraping frameworks</li> <li>Proxy management systems</li> <li>Distributed systems and Kubernetes</li> <li>Apache Airflow</li> <li>Large-scale ETL pipelines</li> </ul> <h3><strong>Ideal Profile</strong></h3> <ul> <li>5+ years of software engineering experience</li> <li>Experience operating systems that process meaningful volumes of data</li> <li>Strong CS fundamentals (algorithms, data structures, distributed systems)</li> <li>Excellent communicator who can explain complex ideas to diverse audiences</li> <li>Passion for solving hard problems and building elegant, scalable systems</li> <li>Self-starter who thrives in fast-paced, evolving environments</li> <li>Experience working in a B2B data company or data-as-a-product environment is a strong plus</li> </ul> <h3><strong>Why This Role Matters</strong></h3> <p>This role sits at the heart of ZoomInfo's data platform. You'll contribute to the infrastructure that powers our business intelligence products and help shape how web data acquisition is done at ZoomInfo, while working at a massive scale with cutting-edge technologies.</p> <p>You'll be joining at a pivotal moment: the team is growing, the ambition is high, and there's a real opportunity to make a lasting impact and grow into deeper technical leadership over time.</p> <p> </p> <p>#LI-AR2</p> <p>#LI-REMOTE</p><div class="content-pay-transparency"><div class="pay-input"><div class="description"><p>Actual compensation offered will be based on factors such as the candidate’s work location, qualifications, skills, experience and/or training. 
Your recruiter can share more information about the specific salary range for your desired work location during the hiring process. We want our employees and their families to thrive.</p> <p>In addition to comprehensive benefits we offer holistic mind, body and lifestyle programs designed for overall well-being. Learn more about ZoomInfo benefits <a class="c-link" href="https://www.zoominfo.com/careers#benefits" target="_blank" data-stringify-link="https://www.zoominfo.com/careers#benefits" data-sk="tooltip_parent">here</a>.</p></div><div class="title">Below is the US base salary for this position. Additional compensation such as Bonus, Commission, Equity and other benefits may also apply.</div><div class="pay-range"><span>$121,100</span><span class="divider">—</span><span>$190,300 USD</span></div></div></div><div class="content-conclusion"><p><strong>About us:</strong> </p> <p>ZoomInfo (NASDAQ: GTM) is the Go-To-Market Intelligence Platform that empowers businesses to grow faster with AI-ready insights, trusted data, and advanced automation. Its solutions provide more than 35,000 companies worldwide with a complete view of their customers, making every seller their best seller.</p> <p>ZoomInfo is committed to protecting your privacy when you apply for jobs with us. Please review our Job Applicant <a href="https://drive.google.com/file/d/1yiG-k0YX_sW10PiJk_xliDc3W-veJCrK/view?usp=drive_link" target="_blank">Privacy Notice</a> for more details on how we handle your personal information.</p> <p>ZoomInfo may use a software-based assessment as part of the recruitment process. More information about this tool, including the results of the most recent bias audit, is available <a href="https://www.zoominfo.com/legal/nyc-local-law-144-notice" target="_blank">here</a>.</p> <p>ZoomInfo is proud to be an equal opportunity employer, hiring based on qualifications, merit, and business needs, and does not discriminate based on protected status. 
We welcome all applicants and are committed to providing equal employment opportunities regardless of sex, race, age, color, national origin, sexual orientation, gender identity, marital status, disability status, religion, protected military or veteran status, medical condition, or any other characteristic protected by applicable law. We also consider qualified candidates with criminal histories in accordance with legal requirements.</p> <p>For Massachusetts Applicants: It is unlawful in Massachusetts to require or administer a lie detector test as a condition of employment or continued employment. An employer who violates this law shall be subject to criminal penalties and civil liability. ZoomInfo does not administer lie detector tests to applicants in any location.</p></div>
