We are seeking a skilled Data Scientist and ML Engineer to develop and deploy generative AI models, including diffusion models and large language models (LLMs). You will design synthetic data pipelines, fine-tune models, and collaborate with cross-functional teams. The role requires 2+ years of experience in machine learning or software engineering, expert-level Python, and familiarity with PyTorch.
Responsibilities
Design and implement LLM-driven pipelines that generate high-quality synthetic data
Develop, fine-tune, and deploy machine learning models with a focus on generative AI (diffusion models, LLMs)
Build and maintain scalable data pipelines for training, evaluation, and inference
Conduct exploratory data analysis to identify opportunities for model improvement
Collaborate with researchers, engineers, and data program managers to deliver ML solutions
Enhance internal tools and automation workflows to accelerate experimentation
Requirements
Bachelor's degree in Computer Science or related field from an accredited U.S. institution
2+ years of experience in Machine Learning or Software Engineering
Expert-level proficiency in Python and familiarity with deep learning frameworks such as PyTorch
Strong foundation in machine learning algorithms, data preprocessing, and evaluation techniques
Demonstrated experience with diffusion models (such as Stable Diffusion) or large language models (LLMs)