Job Description
Job Title: ML/Data Engineer (AWS, Domino & SageMaker)
Location: Reston, VA
Job Type: Full-Time
Job Summary
We are seeking a highly skilled ML/Data Engineer with strong expertise in AWS machine learning services, Domino, Amazon SageMaker, and MLOps. The ideal candidate will design, build, deploy, and maintain scalable machine learning solutions and develop robust data pipelines that support the entire ML lifecycle. This role requires close collaboration with Data Scientists, Software Engineers, and Governance teams to deliver production-ready, compliant, and high-performing ML models.
Key Responsibilities
- Design, build, deploy, monitor, and maintain machine learning models across Domino and Amazon SageMaker.
- Implement and manage MLflow for experiment tracking, model versioning, artifact management, metrics, and end-to-end lineage.
- Develop scalable data pipelines for data ingestion, feature engineering, model training, validation, and inference.
- Package, deploy, and manage ML models across development, testing, staging, and production environments.
- Develop custom model evaluation metrics, explainability frameworks, and bias/fairness testing solutions.
- Monitor model performance and ensure continuous improvement through retraining and lifecycle management.
- Collaborate with Data Scientists, Data Engineers, Software Engineers, and Governance teams to deliver production-ready ML solutions.
- Design and optimize relational and NoSQL data models to support machine learning workloads.
- Normalize databases and ensure data structures meet application and analytics requirements.
- Build datasets by combining raw data from multiple sources into clean, consistent, and machine-readable formats.
- Implement Git-based version control and CI/CD practices for ML and data engineering workflows.
- Ensure compliance with enterprise governance, security, and MLOps best practices.
Required Skills & Experience
Machine Learning & MLOps
- Strong experience with AWS machine learning services
- Hands-on experience with Amazon SageMaker
- Experience with Domino Data Lab
- Proficiency in Python
- Experience implementing MLflow
- Model deployment and lifecycle management
- Model monitoring and performance optimization
- Model explainability and interpretability
- Bias and fairness testing
- Feature engineering
Data Engineering
- Building scalable data pipelines
- Data ingestion and transformation
- SQL and advanced database development
- Data modeling
- Database normalization
- ETL/ELT development
- Relational and NoSQL databases
- Data lakes and modern data platforms
Big Data Technologies
- Apache Spark
- Apache Hive
- Apache Airflow
Cloud & DevOps
- AWS Cloud
- Git-based version control
- CI/CD
- MLOps best practices
Preferred Qualifications
- Experience with enterprise-scale machine learning platforms.
- Experience working with Data Science and AI teams.
- Knowledge of model governance and regulatory compliance.
- Familiarity with distributed computing frameworks.
- Experience building production-grade ML solutions on AWS.
- Strong understanding of scalable data architecture and modern data engineering practices.
