Big Data/Cloud Software Engineer
Ref No.: 17-00648
Location: Cranberry Township, Pennsylvania
Work with a team of engineers on our cloud data platform, which streams data in real time from a variety of healthcare software and hardware systems to create transformational recommendations for our customers. Our solutions help drive improved financial performance, compliance, and better patient outcomes. Each day you will make an impact.
Responsibilities:
Define the technology roadmap in support of the product development roadmap
Architect complex solutions encompassing multiple product lines
Provide technical consulting to multiple product development teams
Develop custom batch-oriented and real-time streaming data pipelines working within the MapReduce ecosystem, migrating flows from ELT to ETL
Ensure proper data governance policies are followed by implementing or validating data lineage, quality checks, classification, etc. (a minimal quality-check sketch follows this list)
Act in a technical leadership capacity: Mentor junior engineers and new team members, and apply technical expertise to challenging programming and design problems
Resolve defects/bugs during QA testing, pre-production, production, and post-release patches
Have a quality mindset, squash bugs with a passion, and work hard to prevent them in the first place through unit testing, test-driven development, version control, and continuous integration and deployment
Lead change, be bold, and innovate to challenge the status quo
Be passionate about solving customer problems and develop solutions that build an enthusiastic customer and community following
Conduct design and code reviews
Analyze and improve efficiency, scalability, and stability of various system resources
Contribute to the design and architecture of the project
Operate within an Agile development environment and apply its methodologies
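As a concrete illustration of the pipeline and data-quality responsibilities above, here is a minimal PySpark sketch of a batch validation step; the device_readings dataset, the quality rules, and the landing/quarantine paths are hypothetical placeholders, not part of our platform.

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import col

spark = SparkSession.builder.appName("batch-quality-check").getOrCreate()

# Hypothetical batch extract landed as Parquet by an upstream ingestion job
readings = spark.read.parquet("/data/landing/device_readings")

# Placeholder quality rules: primary key must be present, value in a plausible range
valid = readings.filter(col("device_id").isNotNull() & col("value").between(0, 1000))
rejected = readings.subtract(valid)

# Quarantine rejected rows for review; publish only validated rows downstream
rejected.write.mode("overwrite").parquet("/data/quarantine/device_readings")
valid.write.mode("overwrite").parquet("/data/validated/device_readings")
```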
Required Skills and Knowledge:
Advanced knowledge of data architectures, data pipelines, real-time processing, streaming, networking, and security
Proficient understanding of distributed computing principles
Good knowledge of Big Data querying tools, such as Pig or Hive
Good understanding of Lambda Architecture, along with its advantages and drawbacks
Proficiency with MapReduce, HDFS
Experience with integration of data from multiple data sources
Basic Qualifications:
Bachelor's degree in Software Engineering or a related field
12+ years of experience in software engineering
Experience developing ETL processing flows using MapReduce-ecosystem technologies such as Spark and Hadoop
Experience developing with ingestion and cluster-coordination frameworks such as Kafka, ZooKeeper, and YARN
Experience building stream-processing systems using solutions such as Storm or Spark Streaming
Experience with various messaging systems, such as Kafka or RabbitMQ (a minimal Kafka-to-Spark streaming sketch follows this list)
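For illustration, a minimal sketch of such a stream-processing pipeline using Spark Structured Streaming reading from Kafka; the broker address, topic name, and event schema are hypothetical placeholders, and running it also requires the matching spark-sql-kafka connector package.

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, from_json
from pyspark.sql.types import StructType, StructField, StringType, DoubleType, TimestampType

spark = SparkSession.builder.appName("device-events-stream").getOrCreate()

# Hypothetical schema for telemetry events published by upstream devices
schema = StructType([
    StructField("device_id", StringType()),
    StructField("metric", StringType()),
    StructField("value", DoubleType()),
    StructField("event_time", TimestampType()),
])

# Read the stream from Kafka (broker and topic names are placeholders)
events = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker1:9092")
    .option("subscribe", "device-events")
    .load()
    .select(from_json(col("value").cast("string"), schema).alias("e"))
    .select("e.*")
)

# Running count of events per device and metric, written to the console for illustration
query = (
    events.groupBy("device_id", "metric")
    .count()
    .writeStream
    .outputMode("complete")
    .format("console")
    .start()
)

query.awaitTermination()
```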
Preferred Experience:
Experience with Databricks and Spark
Ability to troubleshoot and resolve ongoing issues with operating the cluster
Experience managing Spark or Hadoop clusters, with all included services
Experience with NoSQL databases, such as HBase, Cassandra, MongoDB
Experience with Big Data machine-learning toolkits, such as Mahout, Spark MLlib, or H2O
Understanding of Service Oriented Architecture
Technical writing, system documentation, and design document management skills