AWS Data Engineer with EMR
		
		
		
Ref No.: 24-01077
Location: Iselin, New Jersey

Responsibilities:
	- Data Pipeline Development: Design and implement robust ETL processes to extract, transform, and load data from various sources into data lakes and warehouses.
 
	- AWS EMR Clusters: Configure, manage, and optimize Amazon EMR clusters for big data processing using Apache Spark, Hive, or Presto.
 
	- Container Orchestration: Utilize Kubernetes for deploying, scaling, and managing containerized applications and services.
 
	- CI/CD Implementation: Develop and maintain CI/CD pipelines for automated deployment of data applications and services using tools like Jenkins, GitLab CI, or AWS CodePipeline.
 
	- SQL Development: Write complex SQL queries for data manipulation and retrieval, ensuring high performance and scalability.
 
	- Data Quality and Governance: Implement data quality checks, monitoring, and logging mechanisms to ensure data reliability and compliance.
 
	- Collaboration: Work closely with data scientists, analysts, and other stakeholders to understand data requirements and deliver solutions that meet business needs.
 
	- Documentation: Maintain comprehensive documentation of data architecture, processes, and workflows.
 
 
Qualifications:
	- Education: Bachelor's degree in Computer Science, Data Science, Information Technology, or a related field.
 
	- Experience: 3+ years of experience in data engineering or a related role, with a focus on AWS technologies.
 
	- AWS Expertise: Proficient in AWS services such as EMR, S3, RDS, Redshift, Lambda, and CloudFormation.
 
	- Kubernetes Knowledge: Experience with Kubernetes for container orchestration and microservices architecture.
 
	- CI/CD Tools: Familiarity with CI/CD tools and practices, including version control using Git.
 
	- SQL Proficiency: Strong knowledge of SQL, with experience in relational databases (e.g., MySQL, PostgreSQL) and data warehousing solutions.
 
	- Programming Skills: Proficiency in programming languages such as Python, Java, or Scala for data processing tasks.
 
	- Problem-Solving Skills: Ability to troubleshoot complex data issues and optimize performance.
 
	- Communication: Excellent communication skills with the ability to work collaboratively in a team environment.
 
 
 
	 
	 
     
	
	
	
    