Systems Administrator Specialist
Ref No.: 18-01346
Location: Carteret, New Jersey
General Summary:
This position provides first-line support for production systems. The Site Reliability Engineering (SRE) Team's responsibilities range from entry-level to high-level server support. The applicant will be responsible for effective monitoring, maintenance, and troubleshooting of production equipment to achieve maximum uptime. Innovative ideas and the use of automation to improve existing processes and procedures are encouraged. It is critical that the candidate have the potential and drive to grow their technical expertise and advance within the group over time. The work environment is fast-paced, and employees are expected to take on tasks independently.

Job Functions:
The candidate must be a highly organized, detail-oriented individual who will be part of a support team responsible for basic hardware and OS support for enterprise servers. Commitment to excellence, attention to detail, and organizational skills are critical for this job. The candidate needs to understand and solve a wide array of problems with all types of hardware, on the fly or in an emergency.
  • First-level response to and troubleshooting of alarms using Client OpenView
  • Break/fix hardware maintenance on mission-critical systems (BIOS updates, hard drives, memory, RAID controllers)
  • Troubleshoot file system, memory, and CPU utilization alarms on server systems (RHEL, CentOS and Windows)
  • Automating tasks using Python, Java, Bash or any other programming language
  • Server patching on weekends for both Linux and Windows
  • Basic network stack support
  • Build and deploy new servers
  • Provide assistance and escalate as needed to other operations and engineering teams to ensure that all systems are available and working properly.
  • Inventory control for servers, components and connectivity using DCIM software
  • Use of an automated ticket system to organize work requests
  • Develop detailed documentation of knowledge base, best practices, and procedures
  • Proactively and independently work on projects as needed
  • Perform weekend checks and server patching on a rotating basis

Qualifications:
  • Bachelor's degree in an IT-related field, or equivalent experience
  • RHCSA or MCSE certification
  • Knowledge of Windows and Linux server administration
  • Must be able to identify and remediate hardware issues using iDRAC, iLO, etc.
  • Familiarity with a programming language such as Python, Java, or Bash
  • Experience with automation tools such as Puppet and Ansible is a plus
  • Familiarity with BMC BladeLogic
  • Working knowledge of server and networking hardware
  • Excellent written and verbal communication skills (MS Office experience)
  • Self-motivated, reliable team player
  • Consistently work to enhance system performance and systems availability
  • Knowledge of financial trading systems is a plus