Senior Data Engineer
Ref No.: 19-01526
Location: Summit, New Jersey
Position Type: Contract
Duration: 6+ months
Compensation: up to $135/hr (W2)


*** U.S. Citizens and those authorized to work in the U.S. are encouraged to apply. We are unable to sponsor at this time. ***

Responsibilities:
  • Define requirements for the ingestion of new data sources, including lifecycle, data quality checks, transformations, and metadata enrichment.
  • Define data transformation rules for integrating data sources into common/standard data models (e.g., OMOP).
  • Analyze legacy data pipelines and assist the team with migrating them to the Big Data platform.
  • Execute appropriate processes and build automation to manage the quality and lifecycle of data and metadata.
  • Develop and contribute to logical and physical data models for the data integrated into the platform.
  • Provide subject-matter expertise to Data Scientists and Solution Architects who want to leverage data available in the Data Lake.
  • Create sample code, document best practices, contribute to practitioner guides, and provide ad-hoc training to data scientists and data analysts learning how to take advantage of the Big Data tools.
Requirements:
  • 5+ years of hands-on experience in Data Integration, ETL, and/or Data Engineering.
  • 5+ years of hands-on experience with Big Data tools and techniques.
  • 3+ years of hands-on experience with Talend used in conjunction with Hadoop MapReduce/Spark/Hive.
  • 5+ years of combined hands-on experience with various data modeling types including conceptual, logical, relational, dimensional, canonical (messaging, XML, JSON).
  • Strong experience with NoSQL databases.
  • Ability and willingness to be hands-on, including using SQL, ETL tools, and scripts to tune performance and data transformation routines when necessary.
  • Experience creating data flow documentation and business process models, as well as designing and developing data integrations.
  • Hands-on experience with the Hadoop ecosystem (Cloudera preferred), including MapReduce, Spark, Hive/Impala, and HBase.
  • Hands-on experience with languages such as SQL, Python, Scala, Java, R, or Unix shell scripting to build workloads on Hadoop platforms.
  • Experience with Machine Learning applied to data ingestion is a plus.
  • Must have experience with data quality profiling/management tools such as Informatica IDQ or Talend Data Quality.
  • Should have a basic understanding of reporting/visualization tools such as Tableau, Qlik, etc.