Big Data Architect
Ref No.: 17-00538
Location: Austin, Texas
Position Type: Contract
Start Date: 10/09/2017
The Big Data Architect will work directly with key members of the architecture and project teams to build and help implement big data designs in a Hadoop environment. The architect is responsible for helping define data flow, data ingestion, migration, mapping, retention, storage, and capacity in line with application and data usage requirements. From a business systems perspective, the architect is responsible for understanding key business requirements and translating them into data collection, ingestion, distribution, and consumption designs that support the development of an enterprise data lake. Candidates must have hands-on experience with big data designs and Hadoop applications across the entire big data ecosystem, including data ingestion, processing, security, administration, configuration management, monitoring, debugging, and performance tuning.

Primary Responsibilities and Essential Functions
  • This person will help design and develop the key components that make up the Hadoop solution and will be involved in the full development lifecycle. This includes providing input into requirements, platform development, design of the project-level technical architecture, big data application design and development, testing, and deployment of the proposed solution. The big data architect will work with a team of developers, analysts, and enterprise architects. This person is expected to have a good understanding of data lake and data warehousing architectures for many types of data, such as geospatial, structured, and unstructured. Prior experience ingesting geospatial data into Hadoop will be helpful in leading the technical development team.
  • 8+ years' experience working as a data developer and architect with expertise in Hortonworks Hadoop, MapReduce, Hive, HBase, Kafka, Oozie, NiFi, Java, and Scala. Experience working with ETL tools such as Informatica, Talend, or Pentaho is a plus. As part of the big data development team, the big data architect will work to deliver data ingestion from various sources and should have a good understanding of how to handle a variety of database, file, batch, transactional, and streaming data sources. Prior experience building data ingestion solutions using both NiFi and Kafka.
  • This role requires a good understanding of cluster and parallel architectures and good knowledge of big data platforms implemented in the cloud on AWS or Azure. Additionally, the role requires solid experience handling data security and privacy. Experience with Ranger, Knox, and HDFS encryption.
Key Responsibilities:
  • Big data architecture and mentoring in big data / Hadoop development
  • Design and build ingestion procedures into Hive from relational sources using Sqoop
  • Build and use data flows and data ingestion with Oozie, NiFi, and Kafka
  • Participate in gathering project data requirements with business analysts
  • Determine solutions for data ingestion, data migration, application data, and data feeds that meet specific project requirements
  • Design and develop ETL/ELT data flows and data models to support specific data requirements
  • Document designs for Hadoop data ingestion, storage, data migration, and the application data dictionary
  • Collaborate with enterprise architects, analysts, and project management

  • 8+ years designing and engineering solutions in data management and data warehousing using Hadoop
  • Good written and verbal communication skills
  • Experience with Sqoop, Hive, Kafka, NiFi, Oozie, Java, and Spark
  • Expertise in big data application data architecture and supporting implementation
  • Big data design and big data team leadership experience
  • Understanding of source to target mapping
  • Experience in capacity planning and big data storage for cloud-based environments
  • Understanding of enterprise service bus architectures and REST services using Kafka or a compatible solution
  • Integration of data using a publish / subscribe model
  • Involvement in design of bi-directional data movement
  • Past or current experience in ETL development using Informatica or another compatible ETL solution is a plus
  • Understanding of geospatial data integration
  • Data modeling for data lakes, data warehousing, and OLTP, including normalized, de-normalized, and dimensional models
  • Ability to clearly articulate pros and cons of various technologies and platforms
  • Ability to document use cases, solutions, and recommendations