Data Engineer
Ref No.: 19-10019
Location: Dallas, Texas

Duration: FTE

Job Description:

  • Define needs around maintainability, testability, performance, security, quality, and usability for the data platform
  • Drive implementation, consistent patterns, reusable components, and coding standards for data engineering processes
  • Work with the Business Analysts and Customers throughout the requirements process to properly understand the program's long-term goals and where they fit in the overall UI infrastructure
  • Communicate new technologies, best practices, etc. to developers, testers, and managers
  • Mentor team members and peer-review designs and coded implementations
  • Work with technical specialists (Security Team, Performance Engineer, etc.) to ensure that all parties understand the system that is being designed and built and that all major issues are understood and mitigated.
  • Participate in all phases of the product development cycle: design, scoping, planning, implementation, and testing
  • Integral member of our AI and Analytics team, responsible for the design and development of Big Data solutions
  • Partner with domain experts, product managers, analysts, and data scientists to develop Big Data pipelines in Hadoop or Google Cloud Platform
  • Responsible for delivering a data-as-a-service framework from Google Cloud Platform
  • Responsible for moving all legacy workloads to cloud platform
  • Work with data scientists to build Client pipelines using heterogeneous sources, and provide engineering services for data science applications
  • Ensure automation through CI/CD across platforms both in cloud and on-premises
  • Research and assess open-source technologies and components to recommend and integrate into the design and implementation
  • Be the technical expert and mentor other team members on Big Data and Cloud Tech stacks
  • Best practices supported include, but are not limited to:
  • Achieving 85% code coverage with use of Test-Driven Development (TDD)
  • Leveraging automated testing on 100% of API code
  • Leveraging automated testing for Continuous Integration/Continuous Deployment (functional & performance)
  • Ensuring frequent check-in of code and supporting peer code reviews
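The TDD practice above can be illustrated with a minimal sketch: the test is written first and specifies the behavior, then the code is written to satisfy it. `normalize_record` is a hypothetical helper for illustration only, not part of any framework named in this posting.

```python
# TDD-style sketch: the test below was "written first" and defines
# the contract; normalize_record implements just enough to pass it.

def normalize_record(record):
    """Lower-case keys and strip whitespace from string values."""
    return {
        key.lower(): value.strip() if isinstance(value, str) else value
        for key, value in record.items()
    }

def test_normalize_record():
    raw = {"Name": "  Alice ", "Age": 30}
    assert normalize_record(raw) == {"name": "Alice", "age": 30}

test_normalize_record()
```

In practice a tool such as `coverage.py` would report whether tests like this reach the 85% coverage target mentioned above.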

Required skills:

  • 5+ years of experience with Hadoop (Cloudera) or cloud technologies
  • Expert-level experience building pipelines using Apache Beam or Spark
  • Familiarity with core provider services from AWS, Azure, or GCP, preferably having supported deployments on one or more of these platforms
  • Experience with all aspects of DevOps (source control, continuous integration, deployments, etc.)
  • Experience with containerization and related technologies (e.g. Docker, Kubernetes)
  • Experience with other open-source technologies such as Druid, Elasticsearch, Logstash, etc. is a plus
  • Advanced knowledge of the Hadoop ecosystem and Big Data technologies
  • Hands-on experience with the Hadoop ecosystem (HDFS, MapReduce, Hive, Pig, Impala, Spark, Kafka, Kudu, Solr)
  • Knowledge of Agile (Scrum) development methodology is a plus
  • Strong development/automation skills
  • Proficient in programming in Java or Python, with prior Apache Beam/Spark experience a plus
  • System level understanding - Data structures, algorithms, distributed storage & compute
  • Can-do attitude toward solving complex business problems; good interpersonal and teamwork skills
  • Support highly distributed, scalable ETL processes: sourcing, data enrichment, and data delivery using the Ab Initio ETL tool
  • Hands-on ETL production support (data transformations and data movement) using Ab Initio, Ab Initio Express IT, EDQE (Data Quality), and Enterprise Metadata Hub as key tools
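The sourcing, enrichment, and delivery stages named above can be sketched as composed generator stages in plain Python. This is an illustrative pattern only, with hypothetical function and field names; it stands in for what a real Ab Initio graph or Beam/Spark pipeline would do at scale.

```python
# Sketch of a source -> enrich -> deliver pipeline as chained
# generators. Names (source, enrich, deliver, store_id, region)
# are illustrative, not any tool's API.

def source(rows):
    """Sourcing stage: emit raw records one at a time."""
    yield from rows

def enrich(rows, lookup):
    """Enrichment stage: join each record against a reference lookup."""
    for row in rows:
        yield {**row, "region": lookup.get(row["store_id"], "unknown")}

def deliver(rows):
    """Delivery stage: materialize the enriched records."""
    return list(rows)

lookup = {1: "south", 2: "west"}
raw = [{"store_id": 1}, {"store_id": 3}]
result = deliver(enrich(source(raw), lookup))
```

Because each stage is lazy, records stream through one at a time, which mirrors how distributed ETL tools avoid materializing intermediate datasets.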


  • Angular 4 and React development expertise in an up-to-date Java development environment with cloud technologies
  • Exposure to and/or development experience in microservices architecture best practices, the Java Spring Boot framework (preferred), Docker, and Kubernetes
  • Exposure to other data pipeline technologies
  • Experience around REST APIs, services, and API authentication schemes
  • Knowledge in RDBMS and NoSQL technologies
  • Exposure to multiple programming languages
  • Knowledge of modern CI/CD, TDD, and frequent-release technologies and processes (Docker, Kubernetes, Jenkins)
  • Exposure to mobile programming is a plus
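The REST API and authentication-scheme experience listed above can be sketched with the Python standard library alone: attaching a Bearer token to a request is one common scheme. The endpoint URL and token below are placeholders, and the request is only constructed, never sent.

```python
# Minimal sketch of an API authentication scheme (Bearer token)
# using only the standard library. URL and token are placeholders.
import urllib.request

def build_request(url, token):
    """Build a GET request carrying a Bearer token, without sending it."""
    req = urllib.request.Request(url, method="GET")
    req.add_header("Authorization", f"Bearer {token}")
    req.add_header("Accept", "application/json")
    return req

req = build_request("https://api.example.com/v1/data", "TOKEN")
```

Passing `req` to `urllib.request.urlopen` would perform the actual call; other schemes (API keys, OAuth2 flows) differ mainly in how that `Authorization` header is obtained.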