Pre-Registration

Hypotheses, operational definitions, exclusion rules, and statistical methods were committed to the repository before any results were computed. The pre-registration document is version-controlled at scripts/research/indonesia-it-vs-global-v2-2026-06/PRE-REGISTRATION.md. This is the falsifiability anchor v1 lacked.

Six hypotheses were pre-declared (H1–H6). Each maps to a two-proportion z-test at α = 0.05. Results in the article report which hypotheses were supported and which were not.

Dataset Overview

Dataset	Boards	Period	Notes
Indonesia local	JobStreet ID, Loker.id, Glints, Kalibrr	June 2026	One-time scrape, raw archived
Global remote	Contra, WWR, RemoteOK, Remotive, HN, Adzuna, The Muse + others	June 2026 snapshot	Live D1 corpus, frozen export

Classifier Architecture

Taxonomy-first approach

The classifier is taxonomy-first: deterministic regex banks do the bulk of the work. An LLM pass fires only for postings where signals are absent or conflicting for task_altitude and seniority — the two dimensions where short or ambiguous titles can leave the regex banks without a clear signal. Tool/language/AI-skill counts are pure presence matches and never need LLM resolution.
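The routing rule described above can be sketched as follows. This is a minimal illustration, not the repository's classifier: the label set, the regex patterns, and the names SENIORITY_BANK and classifySeniority are all assumptions made for the example. A posting is resolved by regex only when exactly one label fires; otherwise it is marked for the LLM pass.

```typescript
// Hypothetical seniority labels and patterns; the real banks are not shown in the source.
type Seniority = "junior" | "mid" | "senior";

const SENIORITY_BANK: Record<Seniority, RegExp[]> = {
  junior: [/\b(junior|jr|intern|entry[- ]level)\b/i],
  mid: [/\b(mid[- ]level|intermediate)\b/i],
  senior: [/\b(senior|sr|lead|principal|staff)\b/i],
};

type Resolution =
  | { source: "regex"; value: Seniority }
  | { source: "needs_llm"; reason: "absent" | "conflicting" };

// Deterministic first: accept the regex answer only when exactly one label matches.
function classifySeniority(text: string): Resolution {
  const labels = Object.keys(SENIORITY_BANK) as Seniority[];
  const hits = labels.filter((label) =>
    SENIORITY_BANK[label].some((re) => re.test(text)),
  );
  if (hits.length === 1) {
    return { source: "regex", value: hits[0] };
  }
  // No signal, or more than one label fired: defer to the LLM pass.
  return { source: "needs_llm", reason: hits.length === 0 ? "absent" : "conflicting" };
}
```

Under these assumed patterns, a title such as "Senior Backend Engineer" resolves on the deterministic path, while a bare "Engineer" or a mixed "Junior to Senior Developer" would be routed to the LLM.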

The taxonomy version v2-2026-06 is baked into every classified row. Re-running pnpm research:v2:analyze from the frozen dataset.jsonl reproduces every published figure exactly.

AI-skill taxonomy (six dimensions)

Dimension	Representative anchors
Agent orchestration	LangChain, AutoGen, CrewAI, LlamaIndex, Haystack, multi-agent
Prompt engineering	prompt engineering, system prompts, few-shot, chain-of-thought
Eval / testing AI	LLM evaluation, RLHF, model benchmarking, red-teaming, hallucination
RAG & vector DBs	RAG, retrieval-augmented generation, Pinecone, Weaviate, pgvector, Qdrant
MLOps & inference	vLLM, BentoML, Ray Serve, ONNX, MLflow, LoRA, model serving
AI governance	responsible AI, AI ethics, model alignment, guardrails, EU AI Act
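
As an illustration of what a pure presence match over this taxonomy might look like, the sketch below builds one case-insensitive pattern per dimension from the representative anchors in the table. The names AI_SKILL_ANCHORS and detectAiSkills are hypothetical, and the production regex banks are presumably broader than these anchors.

```typescript
// Anchor terms copied from the taxonomy table above; identifiers are hypothetical.
const AI_SKILL_ANCHORS: Record<string, string[]> = {
  "Agent orchestration": ["LangChain", "AutoGen", "CrewAI", "LlamaIndex", "Haystack", "multi-agent"],
  "Prompt engineering": ["prompt engineering", "system prompts", "few-shot", "chain-of-thought"],
  "Eval / testing AI": ["LLM evaluation", "RLHF", "model benchmarking", "red-teaming", "hallucination"],
  "RAG & vector DBs": ["RAG", "retrieval-augmented generation", "Pinecone", "Weaviate", "pgvector", "Qdrant"],
  "MLOps & inference": ["vLLM", "BentoML", "Ray Serve", "ONNX", "MLflow", "LoRA", "model serving"],
  "AI governance": ["responsible AI", "AI ethics", "model alignment", "guardrails", "EU AI Act"],
};

// Escape regex metacharacters so anchors are matched literally.
function escapeRegExp(term: string): string {
  return term.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// One whole-word, case-insensitive alternation per dimension.
// Caveat: case-insensitive matching means "RAG" also matches the plain word "rag".
const AI_SKILL_PATTERNS: Array<[string, RegExp]> = Object.keys(AI_SKILL_ANCHORS).map(
  (dimension): [string, RegExp] => [
    dimension,
    new RegExp(`\\b(?:${AI_SKILL_ANCHORS[dimension].map(escapeRegExp).join("|")})\\b`, "i"),
  ],
);

// Returns every dimension with at least one anchor present in the text.
function detectAiSkills(text: string): string[] {
  return AI_SKILL_PATTERNS.filter(([, re]) => re.test(text)).map(([dimension]) => dimension);
}
```

The whole-word boundaries keep a term like "storage" from triggering the RAG dimension, which is the kind of false positive a naive substring search would produce.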

Classifier accuracy disclosure

Taxonomy version: v2-2026-06
Gold-set size: 200 stratified, hand-labeled rows (100 Indonesia + 100 global; balanced across seniority, task altitude, and AI-skill presence)
CI gate: overall macro-F1 ≥ 0.80 on the deterministic path (tests/unit/research/classify-posting.test.ts); failing this gate blocks the push.
LLM-path accuracy: measured offline; the percentage of rows resolved by the LLM is reported in analysis-report.md.
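
For readers unfamiliar with the metric behind the CI gate, the sketch below computes macro-F1 for a single label dimension: per-class F1 scores averaged with equal weight. It is an assumption-laden illustration, not the code in tests/unit/research/classify-posting.test.ts; the function name macroF1 is hypothetical, and how the real gate aggregates across dimensions into an "overall" score is not specified here.

```typescript
// Macro-F1 over one label dimension: unweighted mean of per-class F1.
function macroF1(gold: string[], predicted: string[]): number {
  if (gold.length === 0 || gold.length !== predicted.length) {
    throw new RangeError("gold and predicted must be non-empty and the same length");
  }
  const labels = Array.from(new Set(gold.concat(predicted)));
  let sum = 0;
  labels.forEach((label) => {
    let tp = 0;
    let fp = 0;
    let fn = 0;
    for (let i = 0; i < gold.length; i++) {
      const isGold = gold[i] === label;
      const isPred = predicted[i] === label;
      if (isGold && isPred) tp++;
      else if (!isGold && isPred) fp++;
      else if (isGold && !isPred) fn++;
    }
    // F1 = 2TP / (2TP + FP + FN); defined as 0 when the class has no TP, FP, or FN.
    const denom = 2 * tp + fp + fn;
    sum += denom === 0 ? 0 : (2 * tp) / denom;
  });
  return sum / labels.length;
}

// A gate in the spirit of the one described above (threshold taken from the text).
function passesGate(score: number, floor = 0.8): boolean {
  return score >= floor;
}
```

Because every class counts equally, a rare label that the classifier handles badly pulls the score down as much as a common one would.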

Statistical Methods

Every proportion reported in the article carries a Wilson 95% confidence interval. The Wilson interval stays well behaved for proportions near 0 or 1 and for small samples, where the normal (Wald) approximation is poor. It is computed by hand in scripts/research-v2-analyze.ts rather than with an external statistics library, to stay within the worker bundle budget.
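
A dependency-free Wilson interval fits in a few lines. The sketch below shows the standard formula; it is not a copy of scripts/research-v2-analyze.ts, and the function name wilsonInterval and the fixed z = 1.96 for a 95% interval are assumptions of this example.

```typescript
interface Interval {
  lower: number;
  upper: number;
}

// Wilson score interval for a binomial proportion.
// z = 1.96 is the two-sided 95% critical value of the standard normal.
function wilsonInterval(successes: number, n: number, z = 1.96): Interval {
  if (n <= 0) throw new RangeError("n must be positive");
  if (successes < 0 || successes > n) throw new RangeError("successes must be in [0, n]");
  const p = successes / n;
  const z2 = z * z;
  const denom = 1 + z2 / n;
  // The centre is pulled from p towards 0.5, more strongly for small n.
  const centre = (p + z2 / (2 * n)) / denom;
  const half = (z / denom) * Math.sqrt((p * (1 - p)) / n + z2 / (4 * n * n));
  return {
    lower: Math.max(0, centre - half),
    upper: Math.min(1, centre + half),
  };
}
```

For 10 hits in 100 postings this gives roughly 0.055 to 0.174, an interval that is asymmetric around the observed 0.10, which is the behaviour that makes it preferable to the Wald interval near the boundaries.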

Statistical tests use a two-proportion z-test (two-tailed, α = 0.05). Cells with n < 30 are flagged "directional only" and excluded from hypothesis conclusions. Results are observational — no causal claims.
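
The test can likewise be written without a statistics library. The following is a sketch under stated assumptions: it uses the pooled-variance form of the two-proportion z-test, approximates the normal CDF through the Abramowitz and Stegun erf formula (7.1.26, absolute error about 1.5e-7), and treats a comparison as "directional only" when either group has n < 30. All identifiers are hypothetical, and the repository's own implementation may differ on each of these points.

```typescript
// Abramowitz and Stegun 7.1.26 approximation of the error function.
function erf(x: number): number {
  const sign = x < 0 ? -1 : 1;
  const ax = Math.abs(x);
  const t = 1 / (1 + 0.3275911 * ax);
  const poly =
    ((((1.061405429 * t - 1.453152027) * t + 1.421413741) * t - 0.284496736) * t +
      0.254829592) * t;
  return sign * (1 - poly * Math.exp(-ax * ax));
}

// Standard normal cumulative distribution function.
function normalCdf(z: number): number {
  return 0.5 * (1 + erf(z / Math.SQRT2));
}

interface ZTestResult {
  z: number;
  pValue: number;
  significant: boolean;
  directionalOnly: boolean;
}

// Two-tailed, pooled two-proportion z-test.
function twoProportionZTest(
  x1: number, n1: number, x2: number, n2: number, alpha = 0.05,
): ZTestResult {
  if (n1 <= 0 || n2 <= 0) throw new RangeError("group sizes must be positive");
  const p1 = x1 / n1;
  const p2 = x2 / n2;
  const pooled = (x1 + x2) / (n1 + n2);
  const se = Math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2));
  // se is 0 only when both groups are all-0 or all-1; treat as no difference.
  const z = se === 0 ? 0 : (p1 - p2) / se;
  const pValue = 2 * (1 - normalCdf(Math.abs(z)));
  return {
    z,
    pValue,
    significant: pValue < alpha,
    // Assumed reading of the n < 30 rule: either group being small flags the cell.
    directionalOnly: n1 < 30 || n2 < 30,
  };
}
```

A caller would report the z statistic and p-value for every cell but draw hypothesis conclusions only from results where directionalOnly is false.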

Exclusion Rules

Blue-collar: isBlueCollarTitle(title) — physical-labour roles identified by title patterns (driver, chef, security guard, cleaning, waiter).
Non-IT titles: Sales executive, cashier, accounting, HR generalist, content writer, and similar non-technical roles identified by title patterns.
Empty titles: Rows with title length < 2.
Salary: Excluded from all analysis. Disclosure asymmetry (Indonesia rarely publishes salary ranges; global sources vary) would confound any cross-market comparison.
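
The title-based rules above could be combined into a single filter along these lines. Only the name isBlueCollarTitle comes from the pipeline; its body here, the NON_IT pattern, and the exclusionReason helper are assumptions built from the example terms listed in the rules, and the real pattern lists are presumably longer and more careful.

```typescript
// Patterns built from the example terms in the exclusion rules; illustrative only.
const BLUE_COLLAR = /\b(driver|chef|security guard|cleaning|waiter)\b/i;
const NON_IT = /\b(sales executive|cashier|accounting|hr generalist|content writer)\b/i;

// Same name as the pipeline's helper, but this body is a guess at its behaviour.
function isBlueCollarTitle(title: string): boolean {
  return BLUE_COLLAR.test(title);
}

type ExclusionReason = "empty_title" | "blue_collar" | "non_it";

// Returns why a row is excluded, or null if it stays in the dataset.
function exclusionReason(title: string): ExclusionReason | null {
  if (title.trim().length < 2) return "empty_title";
  if (isBlueCollarTitle(title)) return "blue_collar";
  if (NON_IT.test(title)) return "non_it";
  return null;
}
```

Returning a reason rather than a boolean makes it straightforward to report how many rows each rule removed. A pattern as simple as this one would also drop a title like "Device Driver Engineer", so a production version would need guards for IT uses of these words.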

Reproducibility

The full pipeline is committed to the repository and can be re-run from scratch:

pnpm research:v2:export — export global D1 slice
pnpm exec tsx scripts/research/indonesia-it-vs-global-v2-2026-06/parse.ts — parse Indonesia scrape
pnpm research:v2:build — merge + classify → dataset.jsonl
pnpm research:v2:analyze — analysis report + public CSVs

The Indonesia raw scraped files, parser, gold set, dataset, and CSVs are all committed in the same PR per the Research Retention SOP.

Limitations

Non-probability sample. Indonesia-local data comes from listing-page scrapes, not a random sample of all Indonesian IT employers.
Listing ≠ hire. A posting reflects stated demand, not actual hiring outcomes.
Description quality varies. Indonesia-local listings are often shorter; regex classifiers may under-detect AI skills in brief postings (recall bias against Indonesia — conservative for H1).
Single point in time. Snapshot from June 2026; AI tool adoption is changing rapidly.
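Classifier errors. The CI gate guarantees only an overall macro-F1 of at least 0.80, so a non-trivial share of individual rows may be misclassified. Macro-F1 is not an accuracy figure, so the floor does not translate directly into a row-level error rate. Aggregate proportions are more reliable than individual classifications.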

Version History

Version	Date	Change
v1	May 2026	199 Indonesia + 1,010 global. No CIs, no pre-registration, parser lost in /tmp.
v2	June 2026	10,000+ postings. Taxonomy-first classifier. Gold-set validation. Wilson CIs. Two-proportion z-tests. Pre-registered. All raw data and parser committed.

Questions or corrections: contact us. Source code: github.com/kelvindesman/lokerdollar.com.