Big Data Engineer

Ref No.:	25-00839
Location:	Malvern, Pennsylvania
Position Type:	Contract

Job Description:
· We are seeking an experienced Machine Learning Engineer to join our AI/ML Engineering team. You will be responsible for developing and optimizing complex data pipelines, integrating model pipelines, and building scalable AI/ML solutions, including large language models (LLMs). The ideal candidate will have a strong background in traditional machine learning and deep learning, along with significant experience working with large datasets and cloud-based AI services.
· Develop and optimize complex data pipelines, applying machine learning engineering principles to enhance efficiency and scalability.
· Integrate and optimize data and model pipelines within production environments, diagnosing data inconsistencies and documenting assumptions.
· Collaborate with data science teams to review model-ready datasets and feature documentation, ensuring completeness and accuracy.
· Perform data discovery and analysis of raw data sources, applying business context to meet model development needs.
· Conduct exploratory data analysis and track data lineage during project inception or root cause analysis.
· Write and maintain model monitoring scripts, diagnosing issues and coordinating resolutions based on alerts.
Qualifications:
· Around 8 years of relevant work experience.
· At least 6 years of hands-on experience designing ETL pipelines using AWS services (e.g., Glue, SageMaker).
· Proficiency in programming languages, particularly Python (including PySpark, PySQL), and familiarity with machine learning libraries and frameworks.
· Strong understanding of cloud technologies, including AWS and Azure.
· Experience with API design and development is a plus.
· Solid understanding of software engineering principles, including design patterns, testing, security, and version control.
· Familiarity with Feature Store usage, LLMs, GenAI, RAG, Prompt Engineering, and Model Evaluation.