Site Reliability Engineering
Ref No.: 17-04172
Location: ORLANDO, Florida
W2 only

Site Reliability Engineering (SRE) is an engineering discipline that combines software engineering and systems engineering to build and run large-scale, massively distributed, fault-tolerant systems. An SRE within the Engineering Excellence team will focus on expanding our tooling and automation and improving system availability.

Build tools to quickly triage issues and Client failures across hardware, software, applications and network
Analyze service trends in depth and implement adjustments to mitigate risk and prevent issue recurrence
Maintain production systems by measuring and monitoring availability, latency and overall system health
Provide guidance to software engineers related to design patterns that are resistant to failure
Support 24x7 on-call response to critical operational issues

Basic Qualifications
  • Strong technical knowledge of the full digital stack, including Mobile, Web, APIs, Messaging, Databases, Networks and their interactions
  • Knowledge and understanding of SDLC principles and key controls
  • Experience working with and contributing to open source code or frameworks using Git version control
  • Strong knowledge of AWS Cloud solutions and product offerings
  • Experience with container technologies (e.g., Docker, Kubernetes)
  • Strong understanding of monitoring methodologies and proactive monitoring using APM solutions (e.g., AppDynamics, New Relic) or other monitoring and instrumentation technologies
  • Knowledge and understanding of technical architecture, application systems design and integration in a large heterogeneous enterprise environment, with hands-on experience in SOA, Angular/Node, Java/J2EE, and Oracle or MySQL/MariaDB programming methodologies
  • Experience working in an Agile environment (e.g., Scrum, Kanban)

Preferred Qualifications
  • 3+ years of programming in one or more of: Java, Node, Python, Perl or C
  • 2+ years of UNIX systems knowledge and/or systems administration background
  • Interest in designing, analyzing and troubleshooting large-scale distributed systems
  • Systematic problem-solving approach, coupled with strong communication skills and a sense of ownership and drive
  • Experience debugging and optimizing code and automating routine tasks