Machine Learning Engineer — Multilingual Data
About This Gig
We’re looking for a Machine Learning Engineer to own and scale our multilingual data pipeline, from sourcing and curation to evaluation and continuous improvement. You’ll work closely with researchers and infra engineers to ensure our models perform robustly across languages, scripts, and cultural contexts.

This role sits at the intersection of data, research, and production ML and is ideal for someone who cares deeply about data quality, linguistic diversity, and model generalization beyond English.

What You’ll Do
- Design, build, and maintain large-scale multilingual datasets across high- and low-resource languages
- Develop data pipelines for collection, cleaning, normalization, deduplication, and labeling
- Implement quality filters using statistical, heuristic, and model-based methods
- Work with researchers to define language coverage, benchmarks, and evaluation metrics
- Analyze dataset bias, coverage gaps, and failure modes across regions and scripts
- Support training, fine-tuning, and d
About the Seller
Featherless AI
on Himalayas